




PROCEEDINGS

OF THE

ROYAL SOCIETY OF EDINBURGH

Section A (Mathematical and Physical Sciences)

Vol. LXI, 1941-42 [Part II


CONTENTS

NO.                                                              PAGE

XI. Quantum Theory of the Chemical Bond. By C. A. COULSON,
University College, Dundee. An Address delivered before the
Society on February 3, 1941. (With Eighteen Text-figures.) . . . 115

(Issued separately December [day illegible], 1941.)

XII. Some Remarks occasioned by the Geometry of the Veronese
Surface. By W. L. EDGE, Mathematical Institute,
University of Edinburgh . . . 140

(Issued separately December 10, 1941.)

XIII. Some Disputed Questions in the Philosophy of the Physical
Sciences. By E. T. WHITTAKER, F.R.S., P.R.S.E.
(Address of the President at the Annual Statutory
Meeting, October 27, 1941) . . . 160

(Issued separately January [day illegible], 1942.)

XIV. Further Investigations in Factor Estimation. By D. N.
LAWLEY, B.A., Moray House, University of Edinburgh.
Communicated by Professor GODFREY H. THOMSON . . . 176

(Issued separately January 26, 1943.)


[Continued on page iv of Cover]


PUBLISHED BY

OLIVER & BOYD

EDINBURGH: TWEEDDALE COURT
LONDON: 98 GREAT RUSSELL STREET, W.C.1


Price 6s. 6d.



ROYAL SOCIETY OF EDINBURGH 

22, 24 GEORGE STREET, EDINBURGH, 2 


THE PREPARATION FOR PUBLICATION OF PAPERS IN
THE TRANSACTIONS AND IN THE PROCEEDINGS 
(SECTIONS “A” AND “B”) OF THE SOCIETY. 

In view of the national necessity for exercising the strictest economy in 
paper and the high cost of publication, authors of papers are requested 
to write their communications in as concise a form as possible and to 
avoid excess of tables and illustrations. 

An author is advised to retain a copy of his paper, as the Society 
cannot undertake any responsibility in relation to the custody of papers 
entrusted to it. The MS. must be easily legible, preferably typewritten 
on one side of quarto or foolscap paper and with pages numbered. It 
must be absolutely in its final form for printing. A short summary of 
the important points in the paper should be given. A table of contents 
(for a long paper), references to plates, etc., must be in their proper 
places, and positions indicated for the insertion of illustrations that are 
to appear in the text. Names of genera and species should be in italics. 
Footnotes should be avoided. 

Additions to a paper after it has been finally handed in for publication
will, if accepted by the Council, be treated and dated as separate
communications, and may, or may not, be printed immediately after the 
original paper. 

References to literature should be placed at the end of the paper, 
alphabetically arranged, under authors’ names, with abridged titles of 
journals, thus:— 

Sandeman, I., 1929. “The Fulcher Bands of Hydrogen,” Proc.
Roy. Soc. Edin., vol. xlix, pp. 48-64.

Whittaker, E. T., and Robinson, G., 1923. A Short Course in
Interpolation, London.

Titles of papers should be quoted exactly, and all references to literature
should be carefully checked by the authors before submitting the
paper. References to literature in the text should be made by quoting 
the author’s name and the year of publication thus (Sandeman, 1929) or 
(Whittaker and Robinson, 1923), and adding the page when necessary. 

All illustrations must be in a form immediately suitable for reproduction,
preferably of a size to permit reduction to about two-thirds the
linear dimensions of the original, and should be capable of reproduction 
by photographic processes. Drawings and diagrams to be reproduced as 
line blocks should be made with fixed Indian ink, preferably on fine 
white bristol board, free from folds or creases; smooth clean lines or 
sharp dots, but no washes or colours, should be used. Graphs should be 
on a squared paper ruled in faint blue lines, unless the lines are to be 
brought out. If the illustrations are on a large scale to be afterwards 
reduced by photography, any lettering must be on a corresponding scale. 


[Continued on page iii of Cover]





XI. Quantum Theory of the Chemical Bond. By C. A. Coulson,
University College, Dundee. An Address delivered before the
Society on February 3, 1941. (With Eighteen Text-figures.)

(MS. received June 11, 1941.) 

1. Introduction. 

The wave mechanics introduced in 1926 by Schrödinger has proved
very fruitful in explaining many features of atomic structure, and in
practically every important respect our knowledge of the motions and
energies of the electrons which move outside the nucleus of an atom is
satisfactory. The same is not true of molecular structure: the
mathematical complications involved in a precise detailed description of the
orbits are so great that they have been overcome only in the case of
one molecule, molecular hydrogen. So in this review of some of the
investigations of the last few years we shall usually be speaking in terms
of approximations; indeed right at the start we meet two different main
avenues of approximation, known as the molecular orbital and electron-pair
methods respectively. We confine ourselves here to the former of
these, the molecular orbital approximation, not because it is the better
(neither is satisfactory, and the existence of the two complementary
approximations is an indication of our partial failure to solve the problem),
but because it is easier to maintain one point of view consistently
throughout this account. The molecular orbital method is associated particularly
with the names of Hund, Hückel, Mulliken, and Lennard-Jones, and the
electron-pair method with the names of Heitler, London, Pauling, Slater,
and Van Vleck.

2. Atomic Orbits. 

Since molecules are formed out of atoms and, as we shall see, 
molecular orbits are themselves compounded out of atomic orbits, it will be 
convenient to review briefly the wave mechanical description of electrons 
in an atom. According to the theory initiated by Hartree, the motions 
of these electrons are governed by the following three principles:— 

(a) Each electron is assigned to a definite orbit, which is described
by a wave function $\psi(x, y, z)$, in which $x$, $y$, $z$ are the co-ordinates
of the electron relative to the nucleus, and $\psi$ itself has to be one of a
definite set of allowed functions, which, since they define the orbit, may be
called Atomic Orbitals. $\psi$ is found as the solution of a certain differential
equation, the Schrödinger Wave Equation. $\psi^2(x, y, z)$ or, if $\psi$ is complex,
$|\psi(x, y, z)|^2$ represents the probability that the electron will be found
in a certain region of space around $(x, y, z)$. If $\psi$ is large at any point
irrespective of its sign (which has no physical meaning), the electron is
likely to be found in that region.

(b) Each of these wave functions $\psi$ has its own appropriate energy
value and, if this is suitably determined, the total energy is merely the
sum of the energies of the constituent orbitals.

Fig. 1. Atomic s and p orbits as found in Hydrogen and Carbon.

(c) In addition to its wave function $\psi(x, y, z)$, an electron has a spin,
which must have one of two values ($\pm\tfrac{1}{2}$ in suitable units), and the Pauli
Exclusion Principle tells us that not more than two electrons in any atom
(or molecule) may have the same orbit; further, if there are two, they must
have their spins different, or opposed to each other. Two such electrons
having the same $\psi$ but opposed spins form a very stable combination
(we speak of them as paired together); they are unreactive and usually
exert a repulsion upon any other electrons near them.

We shall be mainly concerned in this account with Hydrogen and
Carbon. Fig. 1 (after Pauling) shows a rough drawing of the electron
patterns in these atoms. In Hydrogen there is just one electron around
the central nucleus and this electron has a spherically symmetrical
pattern, similar to that marked $s$. It is known as the $1s$ orbital, and is
characterised by the fact that the electron is most likely to be found
around or inside the sphere shown (whose radius is approximately the
atomic radius of the atom). In Carbon, with six electrons, there are
two inner or K-shell orbits which are similar to the Hydrogen $1s$ orbit,
but much more contracted into the nucleus; these take no part in
chemical binding and we need not concern ourselves with them. Next
there are four L-shell electrons, whose orbits must be chosen from $2s$,
$2p_x$, $2p_y$, and $2p_z$. The $2s$ orbit is spherically symmetrical, and only
slightly smaller than the Hydrogen $1s$ orbit; in the other three, shown
as $p_x$, $p_y$, and $p_z$ in the figure, which are also about the same size as
the $2s$ orbit, the electron is practically confined within a region which
resembles a dumb-bell; on account of this fact we may conveniently
refer to these as “dumb-bell” orbits. They have a marked directional
property; in fact $\psi$ is positive in one of the preferred regions and
negative in the other, having what is called a nodal plane, or zero value,
between the two. The analytical form of the $2p_x$ wave function, for
example, is $\psi(2p_x) = x e^{-cr}$, where $c$ is a known constant and $r$ is the
distance from the nucleus; this shows that $x = 0$ is a plane of zero value.

The precise way in which these four orbits are filled depends upon
the nature of the bonds that Carbon is forming in any particular molecule.
For since all three “dumb-bell” orbits have the same energy, we may
take any linear combinations of them instead of $2p_x$, $2p_y$, and $2p_z$.
Further, if the energy of the $2s$ orbital does not differ greatly from that
of the $2p$ orbitals (the energy of binding of the $2s$ orbit is greater than
that of the $2p$ orbits), we may include the $2s$ wave function in the new
compound, or hybrid, orbitals. If the energies are considerably different,
as in Oxygen, this hybridisation does not take place to any large extent.
But in Carbon it happens quite easily, and there are at least two linear
combinations of the $2s$ and $2p$ wave functions that are of the utmost
importance. They are:

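Type A (tetrahedral orbits).

$$
\begin{aligned}
\psi(t_1) &= \psi(2s) + \psi(2p_x) + \psi(2p_y) + \psi(2p_z) \\
\psi(t_2) &= \psi(2s) + \psi(2p_x) - \psi(2p_y) - \psi(2p_z) \\
\psi(t_3) &= \psi(2s) - \psi(2p_x) + \psi(2p_y) - \psi(2p_z) \\
\psi(t_4) &= \psi(2s) - \psi(2p_x) - \psi(2p_y) + \psi(2p_z)
\end{aligned}
$$

Type B (trigonal orbits).

$$
\begin{aligned}
\psi(\mathrm{I}) &= \psi(2s) + \sqrt{2}\,\psi(2p_x) \\
\psi(\mathrm{II}) &= \sqrt{2}\,\psi(2s) - \psi(2p_x) + \sqrt{3}\,\psi(2p_y) \\
\psi(\mathrm{III}) &= \sqrt{2}\,\psi(2s) - \psi(2p_x) - \sqrt{3}\,\psi(2p_y) \\
\psi(\mathrm{IV}) &= \psi(2p_z)
\end{aligned}
$$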

Type A we may call the tetrahedral type, because all four wave
patterns are identical, with a form resembling that shown in fig. 2; in
one of these orbits the electron is very concentrated in a particular
direction, and the wave pattern is symmetrical about this direction;
there is little probability of finding the electron anywhere else than in
this region. A very important property of these orbits is that they point
towards the four vertices of a regular tetrahedron surrounding the
Carbon atom. We are not surprised, therefore, to discover that the
characteristic tetravalency of Carbon is associated with the fact that,
when it is ready for the formation of a saturated molecule such as CH₄
or C₂H₆, we find one electron in each of the four orbits $t_1 \ldots t_4$.

The second type, which we may call the trigonal type, resembles A
in that the first three of the hybrid orbitals are very similar to the
tetrahedral functions, having a marked directional property; but these three
principal directions are all in a plane (the $xy$ plane), and the angles
between them are 120°. These are shown in fig. 3. The remaining orbit
of this series is an undisturbed “dumb-bell” orbit, with its axis
perpendicular to the plane of the other three. Thus in this configuration
there are three equivalent directions in the plane, and we are not surprised
to learn that when the Carbon atom is about to form an unsaturated
molecule, such as Ethylene C₂H₄, in which the angles between the bonds
are 120°, there is one electron in each of the trigonal orbits I ... IV.

Fig. 2. A single tetrahedral orbital.
Fig. 3. The three co-planar trigonal orbitals.
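The geometrical claims made for the two sets of hybrid orbitals can be checked numerically. The following is a minimal sketch, not part of the original address: it assumes the four atomic orbitals 2s, 2p_x, 2p_y, 2p_z are mutually orthogonal and equally normalised, so that each hybrid is represented by its four coefficients, and it takes the direction of a hybrid to be the vector of its three p coefficients. The names `TETRAHEDRAL`, `TRIGONAL`, `is_orthogonal_set` and `angle_deg` are illustrative only.

```python
import math

R2, R3 = math.sqrt(2.0), math.sqrt(3.0)

# Coefficients of (2s, 2p_x, 2p_y, 2p_z), unnormalised, as in Types A and B.
TETRAHEDRAL = {
    "t1": (1.0, 1.0, 1.0, 1.0),
    "t2": (1.0, 1.0, -1.0, -1.0),
    "t3": (1.0, -1.0, 1.0, -1.0),
    "t4": (1.0, -1.0, -1.0, 1.0),
}
TRIGONAL = {
    "I": (1.0, R2, 0.0, 0.0),
    "II": (R2, -1.0, R3, 0.0),
    "III": (R2, -1.0, -R3, 0.0),
    "IV": (0.0, 0.0, 0.0, 1.0),
}


def dot(u, v):
    """Scalar product of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal_set(hybrids, tol=1e-12):
    """True if every pair of hybrids in the set is mutually orthogonal."""
    names = list(hybrids)
    return all(
        abs(dot(hybrids[a], hybrids[b])) < tol
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )


def angle_deg(h1, h2):
    """Angle in degrees between the p-components (directions) of two hybrids."""
    d1, d2 = h1[1:], h2[1:]
    cos_theta = dot(d1, d2) / math.sqrt(dot(d1, d1) * dot(d2, d2))
    return math.degrees(math.acos(cos_theta))


tetra_angle = angle_deg(TETRAHEDRAL["t1"], TETRAHEDRAL["t2"])
trig_angle = angle_deg(TRIGONAL["I"], TRIGONAL["II"])

print("tetrahedral set orthogonal:", is_orthogonal_set(TETRAHEDRAL))
print("trigonal set orthogonal:   ", is_orthogonal_set(TRIGONAL))
print("angle between t1 and t2 (degrees):", tetra_angle)
print("angle between I and II (degrees): ", trig_angle)
```

Under these assumptions the tetrahedral pair gives an angle whose cosine is -1/3, and the trigonal pair an angle whose cosine is -1/2, in agreement with the tetrahedral and 120° directions described in the text.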

3. The Normal Single Bond.

When we pass from studying the electrons in atoms to discuss the
formation of molecules, we have to consider how the electrons will move
in the presence, not of one, but of two or more nuclei. The three
principles (a), (b), and (c) which we outlined in § 2 for the atomic orbits
still hold for molecular orbits. That is:

(a) Each electron is assigned a wave function $\psi$ which defines the
orbit. These functions, or molecular orbitals, as they are called
to distinguish them from the atomic orbitals, are no longer monocentric,
but are polycentric, since an electron which takes part in molecule
formation is not confined to one nucleus. $\psi^2(x, y, z)$ still represents the
probability of finding the particular electron around the point $(x, y, z)$.

(b) Each $\psi$ is associated with a definite energy value.

(c) Each electron has a spin, and obeys the Pauli Exclusion Principle,
just as in the case of an atom.

To these three principles we may add three more: 

(d) In the neighbourhood of any nucleus a molecular orbit resembles
one of the allowed atomic orbits. This follows from the fact that when 
the electron is near a particular nucleus, the chief forces on it arise from 
this nucleus and the other electrons round this nucleus, so that the 
solution of the wave equation in this neighbourhood will resemble an 
atomic orbital. In this sense, then, molecular orbitals may be regarded 
as built up out of atomic orbitals. 

(e) The energy of a molecular orbit is lowest (i.e. binding energy is 
greatest) when the atomic orbits which it resembles overlap as much as 
possible. This is the Principle of Maximum Overlapping and, as we 
shall see, immediately leads us to the fundamental results of
stereochemistry.

(f) The five principles (a)-(e) hold for all molecular orbits. However,
when we are dealing with molecules in which only single bonds occur,
it is possible to show, though we shall not reproduce the analysis here,
that a good enough approximation to the molecular orbits is obtained by
supposing that they are bicentric; that is, they embrace two neighbouring
nuclei only. Strictly, of course, we should allow them the possibility of
migrating from one nucleus to any other, but it turns out that an electron
which is helping to form a single bond between two atoms A and B has
very little chance of moving away from the region between A and B on
to other nuclei C, D ... These electrons may be called localised
electrons.

We may illustrate these principles by considering first the simplest of
all molecules, H₂. Here there are two electrons, and each is assigned
the same wave function $\psi$, with opposed spins. Near one of the nuclei, A,
$\psi$ resembles the ordinary atomic orbital $\psi_a$, and near the other nucleus B
it resembles $\psi_b$. We may take for its approximate form

$$\psi = \psi_a + \psi_b. \tag{1}$$

This wave function is represented in fig. 4, where, on the left, we have the
two separate atomic orbits out of which the final orbit on the right is
formed. Such a wave pattern is symmetrical about the line AB, and we
might refer to it pictorially as a “sausage-type” orbit, since the electrons
are each confined to the same region whose shape resembles that of a
sausage. These orbits are known as $\sigma$-type orbits. In fig. 5 there is
drawn the probability function $\psi^2$ for points along the line of symmetry
AB, which joins the nuclei. It is clear from this diagram that the
electrons in this bond are closely localised in the region between A and
B, with very little probability of moving away from the “bond direction.”
These two electrons, with identical orbits and opposed spins, form what
we may call an electron-pair; they cause the molecule to repel other
electrons, and indeed a certain amount of energy is usually needed to
overcome this before the molecule can be made to react with other
molecules.

Fig. 4. Formation of molecular orbit for H₂ out of atomic orbits.
Fig. 5. Probability function along the axis of H₂.
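The concentration of charge between the nuclei described by (1) can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the calculation behind fig. 5: the atomic orbitals are taken to be unmodified hydrogen 1s functions $e^{-r}/\sqrt{\pi}$ in atomic units, the internuclear distance is set to 1.4 atomic units (close to the observed H₂ bond length), and the closed form used for the overlap of two 1s functions, $S = e^{-R}(1 + R + R^2/3)$, is the standard result. All function and variable names are hypothetical.

```python
import math

R_BOND = 1.4  # assumed internuclear distance in atomic units (bohr)


def overlap_1s(R):
    """Overlap integral of two hydrogen 1s orbitals a distance R apart."""
    return math.exp(-R) * (1.0 + R + R * R / 3.0)


def psi_1s(x, centre):
    """Normalised hydrogen 1s orbital evaluated on the internuclear axis."""
    return math.exp(-abs(x - centre)) / math.sqrt(math.pi)


def bonding_density(x, R):
    """psi^2 on the axis for the normalised orbital N (psi_a + psi_b)."""
    n_squared = 1.0 / (2.0 * (1.0 + overlap_1s(R)))
    psi = psi_1s(x, -R / 2.0) + psi_1s(x, R / 2.0)
    return n_squared * psi * psi


def atomic_average_density(x, R):
    """Density if the electron were shared equally between two separate atoms."""
    return 0.5 * (psi_1s(x, -R / 2.0) ** 2 + psi_1s(x, R / 2.0) ** 2)


S = overlap_1s(R_BOND)

# Ratio of molecular to atomic density at the mid-point of the bond ...
ratio_mid = bonding_density(0.0, R_BOND) / atomic_average_density(0.0, R_BOND)
# ... and at a point on the axis 2 bohr beyond nucleus B.
x_out = R_BOND / 2.0 + 2.0
ratio_out = bonding_density(x_out, R_BOND) / atomic_average_density(x_out, R_BOND)

print("overlap S:", S)
print("density ratio at mid-point:", ratio_mid)
print("density ratio beyond nucleus B:", ratio_out)
```

With these assumptions the ratio exceeds unity at the mid-point and falls below unity outside the nuclei, which is the localisation in the bond region that the text describes.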

The energies of these orbits may be calculated, and it turns out that
they are lower than those of isolated Hydrogen atoms. Thus there is a
lowering of the total energy when two Hydrogen atoms join to form a
molecule H₂. This, with some small corrections, is the binding energy.

The Hydrogen molecule-ion H₂⁺, in which there is only one electron,
is found in the discharge tube. This molecule will resemble H₂ in that
the single electron has a bicentric orbit similar to that in fig. 4; but
since this electron is not paired with a second electron to form a
closed group, the ion will be very reactive. Indeed this is found to
be the case, for it rapidly combines with other molecules and cannot
be isolated.

If we had considered HCl instead of H₂ the situation would have
been very similar. For there are just two unpaired electrons available
for forming bonds; one of them is the Hydrogen $1s$ electron, and the
other is the Chlorine $3p_x$ electron. (The $3p_x$ orbit resembles the $2p_x$
orbit of fig. 1, and the $x$ direction is along the line of the nuclei.) These
two electrons together form a $\sigma$-type molecular orbit which may be
described, as in (1), by the expression

$$\psi = \lambda\,\psi(1s) + \mu\,\psi(3p_x). \tag{2}$$

The constants $\lambda$ and $\mu$ may be found mathematically, but they are no
longer equal since the two constituent atoms are different. The orbit is
like a “sausage,” with symmetry about the line of nuclei, but the
“sausage” is fatter at one end (Cl) than at the other, corresponding to a
greater chance of finding the electron near the more electronegative atom
Cl. In fact, the ratio $\lambda : \mu$ may be immediately related to the dipole
moment of HCl.

The simplest polyatomic molecule to illustrate the rules (a)-(f) is
Water H₂O. The Oxygen atom has two inner K electrons which remain
close around the nucleus and do not concern us further; there are six
others, and two of them fill the $2s$ atomic orbital, forming a closed
sub-group around the nucleus. Two others fill the $2p_z$ orbital (fig. 1) pointing
in the $z$ direction, and, being paired together, cannot take part in molecule
building. The remaining two are one each in the $2p_x$ and $2p_y$ orbits
(fig. 6). As we stated earlier, there is practically no hybridisation with
Oxygen, so that the $2s$ and $2p$ wave functions do not mix. If we are to
get maximum overlapping (rule (e)), we must put the two Hydrogen
atoms along the $x$ and $y$ axes, as shown on the left in fig. 6. The resulting
molecular orbits are shown on the right; two electrons, with opposed
spins, are allotted to each of the $\sigma$-type bonds shown. Each orbit
resembles a Hydrogen $1s$ orbit near the H nucleus, and an Oxygen $2p$ orbit
near the O nucleus, and is symmetrical about the line of the bond. We
can see on this ground that the HOH angle should be about 90°; actually
repulsions between the two H atoms and a trace of hybridisation open
out the angle to about 105°.

A similar discussion could be given for ammonia NH₃. The N atom
has three unpaired electrons, one each in the $2p_x$, $2p_y$, and $2p_z$ orbits.
It forms three equivalent bonds with the H atoms, such that the HNH
angles should all be 90°, and each bond involves two electrons in a $\sigma$-type
orbit having opposed spins. Here again the H repulsions open out the
HNH angles to rather more than 90°, making the molecule a somewhat
flat pyramid.

Fig. 6. The Water molecule.

We can deal with Carbon compounds in the same way. If we
suppose that the four unpaired electrons in Carbon are one each in the
four tetrahedral orbits (A) of § 2, it can form four equivalent bonds, and
the Criterion of Maximum Overlapping shows that these are directed to
the four corners of a regular tetrahedron, making angles of 109° 28' with
each other. In this way the familiar facts of stereochemistry in organic
compounds involving only single bonds receive a natural explanation.
If the four radicals which are bonded to the central Carbon atom are
different among themselves, their mutual repulsions and different electron
affinities will cause a slight dislocation of the tetrahedral angles, but it
can be shown that this effect will not alter the angles by more than 2 or
3 degrees.

One more result concerning these single bonds follows from our
previous discussion. In every case a single bond is represented by two
electrons whose orbits are closely confined, or localised, in the region
between two nuclei and, except in unusual circumstances, almost
independent of the nature of the other atoms bonded to the two nuclei
concerned. It follows from this that the energy of such a bond is almost
independent of the other bonds present, and we are led to the idea of a
bond energy, characteristic of two bonded atoms, and practically constant
for all molecules. In this way a table of bond energies could be drawn
up, and it appears, naturally enough, that in these saturated molecules
bond energies are constant and, apart from a few exceptional cases, are
additive to give the total binding energy of the molecule.


4. The Double Bond. 


A new situation arises when we come to the unsaturated molecules in
which there occur double bonds; the characteristic case is Ethylene:

H₂C=CH₂


To explain the orbits of the electrons in this molecule we must go back
to the trigonal wave functions already described in § 2. For if the two
Carbon atoms are in the trigonal state we can soon form five “sausage”
bonds, each of two electrons, such that there are three bonds from each
Carbon atom: and the Criterion of Maximum Overlapping requires that
these are all at angles of 120° with each other. This leaves us with one
electron from each Carbon atom which is unpaired and not so far engaged
in the formation of a bond. These two “loose” electrons are the
dumb-bells perpendicular to the planes defined by the earlier orbitals, and the
best that we can do is to pair these together somehow. Now if we arrange
the two CH₂ planes so that they are coincident, the two dumb-bells are
pointing parallel, and this is the direction in which they overlap most
(fig. 7). It is true that the overlap is less than with the other bonds, and
this accounts for the lower stability of these orbits, but we cannot do better.
If we now allow them to interact, these $2p_z$ electrons will form themselves
into molecular orbitals, as shown in the figure, in which the probability
function resembles two streamers, one below and the other above the
plane of the C₂H₄ skeleton. It can easily be seen that if we divide this
wave pattern into two halves by a plane perpendicularly bisecting AB,
the two halves bear a close resemblance to the atomic $2p_z$ orbits from
which the molecular orbit is formed. The wave function for this new
kind of molecular orbital may be written down in a manner similar to
that used for H₂. Thus near nucleus A the orbital resembles $\psi_a(2p_z)$
and near nucleus B it resembles $\psi_b(2p_z)$; its approximate description is
therefore

$$\psi = \psi_a(2p_z) + \psi_b(2p_z). \tag{3}$$

It is precisely this streamer effect, in which the two electrons are paired 
with opposite spin, each having a wave function such as that in fig. 7, 
that converts a single bond into a double bond, and which is characteristic 
of every double bond. It is important to realise that both electrons have 
the same “double-streamer” wave function; it is not a case of the top 
streamer representing one electron and the bottom streamer the other 
electron. The two streamers go together; they are inseparable, and 
both together constitute that “extra” which, superposed on a single bond, 
converts it into a double bond. These streamers, which of course are 
very far from being symmetrical about the line joining the nuclei, are 
called a $\pi$-bond.

If this description of an Ethylene molecule is correct (that four of the
bonds are of $\sigma$-type, each with two electrons, and the fifth has four
electrons, two being in a $\sigma$-type orbit and the other two in a
“double-streamer” $\pi$-type orbit), it is necessary that the molecule should be
planar (otherwise the $\pi$-electrons cannot overlap as much as possible);
also the angles between all the bonds should be 120°, and there should
be some considerable resistance to twisting one of the CH₂ groups relative
to the other. Every one of these inferences has received ample
confirmation at the hands of the spectroscopists, crystallographers, and X-ray
workers.

The double bond is thus very far from being two single bonds
superposed, and, on account of the double-streamer electrons, it will have
characteristic properties not possessed by single bonds. One of these is
its high reactivity; this is explained by the fact that these streamer
electrons are not so tightly bound as the other electrons because the
atomic orbits from which they are built up do not overlap so much;
consequently it is easier to disengage them from one another and link
them to other atoms, to form a new molecule.

The triple bond, about which we shall not speak further here, differs
from the double bond simply in the addition of an extra pair of streamer
electrons, but the plane of these new ones is perpendicular to that of the
former pair. As a result we have a bond that is somewhat more
symmetrical about the line joining the nuclei, and less resistant to twisting of
the two halves relative to each other.

The description that has been given above for the C=C bond would
apply just as well for other double bonds, as O=O or S=S, except that
there are none of the hybrid single bonds present. With mixed bonds,
however, such as C=O, the double streamers will be thrown more over
on to one or other of the nuclei (with C=O, the O nucleus) on account of
its greater electronegativity. In every case the electrons are allotted to
orbits which are localised in the region of the two nuclei, and the normal,
or pure, double bond, like the normal single bond, has a characteristic
binding energy which is additive throughout the molecule.

5. Conjugated Compounds. 

When we come to discuss more complicated compounds in which
single and double bonds occur alternately, the dumb-bell electrons which
we used in Ethylene to convert a single bond into a double bond become
even more important. Let us consider the Allyl radical C₃H₅ (fig. 8).
Each of the bonds marked in the diagram (a) will absorb two electrons,
and will form a set of $\sigma$-type bonds. There remains one electron from
each Carbon atom not so far engaged in molecule formation. If the
angles of the bonds are 120° so that the Carbon atoms are in the familiar
trigonal state, these free electrons are the dumb-bells perpendicular to
the planes of the other bonds. We have supposed that all the atoms lie
in a plane (a situation that has been confirmed in other similar molecules
by spectroscopic analysis), so that these three dumb-bells point in parallel
directions perpendicular to the plane of the paper; they are marked
symbolically with a cross. The molecular orbits for these three electrons
are different from any others yet described; for it turns out that these
orbits have to embrace all three nuclei; they are tricentric orbits and we
cannot localise them in pairs to form a double bond between any two
of the nuclei. In other words, these three electrons are free to move
over the whole Carbon skeleton. For this reason they have been called
mobile electrons. (Other names in current use include $\pi$-electrons and
unsaturation electrons.) In more complex molecules they may be
able to move over as many as ten or a dozen nuclei, and they resemble
very closely the conduction electrons of a metal. The binding effect of
these mobile electrons is, indeed, intermediate between that of an ordinary
molecule with its localised bonds, and a metal where there are no rigidly
maintained bonds at all. It is this easy mobility of these electrons that
accounts for the anomalous electric and magnetic susceptibilities of
aromatic molecules, and also for the readiness with which influences may
be transmitted from one part of the molecule to another. The ortho-para
directing property of substituted benzenes finds a ready explanation in
this manner.

Fig. 8. The Allyl radical.

Returning then to the Allyl radical, we have to place the three mobile
electrons in orbits that embrace all three nuclei, and which, in the
neighbourhood of any one of them, resemble one of the dumb-bell
atomic orbits. If we call the nuclei 1, 2, 3 and write $\psi_1$, $\psi_2$, and $\psi_3$ for
these dumb-bell orbitals, the molecular orbitals will, by analogy with
the case of H₂ which we discussed earlier, be written

$$\psi = c_1\psi_1 + c_2\psi_2 + c_3\psi_3.$$

The constants $c_1$, $c_2$, and $c_3$ have to be chosen so that this wave function
is as nearly as possible a solution of the wave equation. This can be
done by a minimum energy method, and the analysis closely resembles
that required in finding the normal modes of vibration of three particles
on the three Carbon centres. It can be shown that there are three
possible molecular orbits of the required character, and their energies
may be calculated. If $E_0$ is the energy of an isolated $2p_z$ atomic orbit,
these molecular energies are

$$E_0 + \sqrt{2}\,\beta, \qquad E_0, \qquad E_0 - \sqrt{2}\,\beta,$$

where $\beta$ is a certain constant, with a negative value, whose magnitude
may be obtained from comparison measurements on Ethane and Ethylene.
The first of these orbits has the greatest binding energy, and we therefore
allot two electrons to this level; the remaining electron goes into the
middle level, so that the total energy of the mobile electrons is $3E_0 + 2\sqrt{2}\,\beta$.
If we had supposed (incorrectly, as it now appears) that two of the
electrons had formed a double bond between nuclei 1 and 2, while the
third electron remained unpaired on nucleus 3 (diagram (b)), or,
alternatively, that the double bond had been between 2 and 3 with the unpaired
electron localised on nucleus 1 (diagram (c)), the energy would have
been $3E_0 + 2\beta$. Since $\beta$ is numerically negative, there is thus an increase
of binding energy due to the extra mobility of these electrons, above that
to be expected on the basis of simple additivity of bond energies. We
may describe the resulting bonds as non-localised bonds, thereby
differentiating them completely from the previous localised bonds. The
extra energy gained by removing the restriction to localisation has been
called the resonance energy. Its value has been determined
experimentally for many of these compounds. In the case of Allyl the resonance
energy is about 15 kcals.
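The three energy levels and the resonance energy quoted for Allyl can be reproduced numerically. The sketch below is not the analysis of the original paper; it assumes the simplest form of the minimum energy method, in which each dumb-bell orbital has energy $E_0$, each pair of neighbouring Carbon atoms is coupled by $\beta$, and overlap between the atomic orbitals is neglected. The resulting symmetric matrix is diagonalised by a plain Jacobi rotation routine. The helper names `jacobi_eigenvalues` and `coupling_matrix`, and the units $E_0 = 0$, $\beta = -1$, are illustrative choices only.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-24, max_sweeps=50):
    """Eigenvalues of a small real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that annihilates the element a[p][q].
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.hypot(1.0, tau))
                c = 1.0 / math.hypot(1.0, t)
                s = t * c
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def coupling_matrix(n_atoms, bonds, e0, beta):
    """E0 on the diagonal, beta between each bonded pair of atoms."""
    h = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        h[i][i] = e0
    for i, j in bonds:
        h[i][j] = h[j][i] = beta
    return h


E0, BETA = 0.0, -1.0  # assumed units: energies measured from E0 in units of |beta|

# Allyl: three centres in a chain, electrons free to move over all three.
allyl_levels = jacobi_eigenvalues(coupling_matrix(3, [(0, 1), (1, 2)], E0, BETA))
delocalised = 2.0 * allyl_levels[0] + allyl_levels[1]

# Localised picture: one two-centre double bond plus one unpaired electron.
pair_levels = jacobi_eigenvalues(coupling_matrix(2, [(0, 1)], E0, BETA))
localised = 2.0 * pair_levels[0] + E0

resonance_energy = localised - delocalised  # positive means extra binding

print("allyl levels:", allyl_levels)
print("mobile-electron energy, delocalised:", delocalised)
print("mobile-electron energy, localised:  ", localised)
print("resonance energy in units of |beta|:", resonance_energy)
```

In this model the resonance energy comes out as $(2\sqrt{2} - 2)|\beta|$, roughly $0.83|\beta|$. If that is equated to the 15 kcals quoted above, $|\beta|$ would be of the order of 18 kcals; this is an inference from the simple model and not a figure given in the text.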

Another physical interpretation of this resonance energy is sometimes
given as follows (although it does not fall strictly within the present line
of thought): if we had tried to draw a chemical bond diagram for this
molecule we should have been forced into either (b) or (c) of fig. 8. If
we say that neither of these states represents the true picture of this
molecule, but that there is very rapid resonance from one to the other
and back, this resonance taking place so rapidly that it cannot be observed,
then, on the basis of mechanical problems in which resonance occurs, we
shall expect a lowering of the energy. In this case the lowering is about
15 kcals, and on account of the rapid changing about, the two bonds
cannot be called either single or double bonds, but partake of the
properties of both. This interpretation is the basis of the electron-pair,
or valence-bond, approximation.

Fig. 9. Benzene.

A similar, and extremely important, molecule is Benzene $\mathrm{C_6H_6}$ 
(fig. 9 (a)). It is known from a variety of evidence that all the atoms, 
Hydrogen and Carbon, lie in a plane. One of the classic problems of 
organic chemistry has been to decide how the electrons are allotted to 
this molecule, and what kinds of bonds they will form. The solution 
that went furthest to explain the chemical behaviour of Benzene was that 
of Kekulé, who supposed that instead of having either of the two possible 
configurations ( b ) and (c) in which single and double bonds alternate, the 
molecule was continually changing, oscillating from one structure to the 
other. But on our present viewpoint this can be explained somewhat 
differently. For when we come to allot the electrons, we shall first allot 
two electrons to each of the twelve “sausage” bonds, and we see that 
for this to be possible the Carbon atoms must each be in a trigonal state. 



Fig. 10.—Overlapping of atomic σ-type orbits in Benzene. 

Fig. 11.—Mobile electrons in Benzene $\mathrm{C_6H_6}$. 

The atomic orbits from which these “sausage” bonds are constructed 
are shown in fig. 10, which explains why this molecule and other aromatic 
molecules form plane rings containing six Carbon atoms. (In no other 
way can we get the required optimum angles of 120° for maximum 
overlapping from neighbouring atoms.) A simple counting of electrons 
shows that there are now six electrons left unpaired; these are the dumb-bell 
electrons, all perpendicular to the plane of the hexagon, and therefore 
parallel. If these electrons did not interact with each other to form 
molecular orbits, they would resemble the top picture of fig. 11. However, 
when we allow them to interact, they become free to migrate from one 
nucleus to another; they are mobile electrons like the mobile electrons 
of the Allyl radical, and they take up molecular orbits which we could 
represent diagrammatically by the lower picture in fig. 11. The streamer 
bonds which we saw were typical of Ethylene are now spread out and 
we have two streamers going right round the molecule, above and below 
the central plane. 

If we call the nuclei 1 . . . 6, then these molecular orbitals are represented 
by suitable sums of the atomic orbitals: 

$\Psi = c_1\psi_1 + c_2\psi_2 + \dots + c_6\psi_6.$ 

The constants $c_r$ must be chosen so that this function is as good a solution 
as possible of the wave equation. The necessary calculations are quite 
simple and it can be shown that there are six possible molecular levels, 
each with a different set of constants $c_r$, and corresponding energies $E_r$. 
If $E_0$ and $\beta$ have the same meanings as before, the actual orbitals and 
energies are shown in the following table:— 

$\Psi_1 = \psi_1 + \psi_2 + \psi_3 + \psi_4 + \psi_5 + \psi_6$,   $E_1 = E_0 + 2\beta$ 

$\Psi_2 = \psi_2 + \psi_3 - \psi_5 - \psi_6$,   $E_2 = E_0 + \beta$ 

$\Psi_3 = 2\psi_1 + \psi_2 - \psi_3 - 2\psi_4 - \psi_5 + \psi_6$,   $E_3 = E_0 + \beta$ 

$\Psi_4 = 2\psi_1 - \psi_2 - \psi_3 + 2\psi_4 - \psi_5 - \psi_6$,   $E_4 = E_0 - \beta$ 

$\Psi_5 = \psi_2 - \psi_3 + \psi_5 - \psi_6$,   $E_5 = E_0 - \beta$ 

$\Psi_6 = \psi_1 - \psi_2 + \psi_3 - \psi_4 + \psi_5 - \psi_6$,   $E_6 = E_0 - 2\beta$ 

The first three of these levels may be called bonding orbits, since the 
energy is less than that of an isolated $2p_z$ atomic orbit ($\beta$ negative). The 
other three are anti-bonding. In the normal state of Benzene there will 
be two electrons in each of the bonding orbits, so that their energy is 
$2(E_1 + E_2 + E_3) = 6E_0 + 8\beta$. Now the energy of a single Kekulé structure 
such as fig. 9 (b) or (c) is $6E_0 + 6\beta$, so that the resonance energy is $2\beta$, 
i.e. about 30 kcals; this agrees quite well with the experimental 
value. 
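
The table of Benzene orbitals and the value $2\beta$ for the resonance energy can be checked by direct substitution. The sketch below assumes, as the text implies, that each atom interacts only with its two neighbours in the ring and that overlap is neglected; the helper name `apply_ring` is ours. Energies are again written $E = E_0 + x\beta$.

```python
# Ring of six Carbon atoms; energies written E = E0 + x*beta (beta < 0).
N = 6

def apply_ring(coeffs):
    """beta-part of the energy operator: each atom couples to its two neighbours."""
    return [coeffs[(r - 1) % N] + coeffs[(r + 1) % N] for r in range(N)]

# (x, coefficients of psi_1 .. psi_6) for the six orbitals of the table
ORBITALS = [
    (2,  [1, 1, 1, 1, 1, 1]),
    (1,  [0, 1, 1, 0, -1, -1]),
    (1,  [2, 1, -1, -2, -1, 1]),
    (-1, [2, -1, -1, 2, -1, -1]),
    (-1, [0, 1, -1, 0, 1, -1]),
    (-2, [1, -1, 1, -1, 1, -1]),
]

# Each listed orbital must satisfy H c = x c exactly.
for x, c in ORBITALS:
    assert apply_ring(c) == [x * cr for cr in c]

bonding = sorted((x for x, _ in ORBITALS), reverse=True)[:3]
x_mobile = 2 * sum(bonding)   # two electrons in each bonding orbit
x_kekule = 3 * 2 * 1          # three localised double bonds, 2*beta each
resonance_in_beta = x_mobile - x_kekule
print(x_mobile, x_kekule, resonance_in_beta)
```

The three numbers printed are the multiples of $\beta$ in the mobile-electron energy, the Kekulé energy, and their difference.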

Confirmation of the point of view regarding Benzene can be obtained 
from two other independent sources. In the first place the occupied 
orbits $\Psi_1$, $\Psi_2$, and $\Psi_3$ embrace all six Carbon nuclei, and we may interpret 
this to mean that the electrons in these orbits can move in a “closed 
circuit” round the Benzene ring. In the absence of any magnetic field 
they are equally likely to travel in either direction round the ring, but 
in the presence of such a field there will be a preference for one rather 
than the other; thus there should be a large diamagnetism when the 
magnetic field is perpendicular to the plane of the molecule. This is, 
indeed, precisely what is found experimentally, and its magnitude is in 
accord with the predicted “electron current” round this small circuit. 
The second confirmation comes from the absorption spectrum. The 
characteristic absorption of Benzene in the ultra-violet may be explained 
by supposing that one of the electrons in the molecular orbits $\Psi_1$–$\Psi_3$ is 
excited into one of the unoccupied orbits $\Psi_4$–$\Psi_6$. Explicit calculations 
show that the frequency of this absorption band is just what we should 
have expected it to be on this theory. It may be added here that the 
characteristic colour of many of the organic dyes is explained in precisely 
the same way, as being due to the excitation of a mobile electron from 
an occupied (and bonding) orbit to an unoccupied (and anti-bonding) 
orbit. 


Similar analysis to that which we have described for Benzene can be 
carried through for other molecules of this nature. Naphthalene (fig. 12) 
$\mathrm{C_{10}H_8}$ is an example; the nuclear framework is planar, and the ten 
mobile electrons which remain after allocating the localised σ-type 
bonds are free to move over all the ten Carbon nuclei. We should write 
for the wave function of a typical member of these mobile electrons 

$\Psi = c_1\psi_1 + c_2\psi_2 + \dots + c_{10}\psi_{10}$ 

and choose the constants $c_r$ so that this is as good a solution of the wave 
equation as possible. The calculation of the energies and the $c_r$ involves 
the solution of a determinantal equation of the 10th degree, but this is 
not impossible, and the wave functions and energies can be determined. 
The calculated resonance energy is in tolerably good agreement with the 
experimental value. 

Fig. 12.—Naphthalene. 
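
The determinantal equation of the 10th degree can be written out explicitly. The sketch below is an illustration only: it assumes nearest-neighbour interaction with a single $\beta$ and no overlap, numbers the atoms 0 to 9 round the perimeter with the shared link between atoms 4 and 9 (our own numbering, not that of fig. 12), and obtains the polynomial by the Faddeev-LeVerrier recurrence, a method chosen here for convenience. The reference energy of five localised double bonds, $10E_0 + 10\beta$, is likewise our assumption, made by analogy with the Kekulé structure of Benzene.

```python
import math
from fractions import Fraction

# Naphthalene carbon skeleton: perimeter 0..9 plus the shared link 4-9.
# This numbering is an assumption of the sketch, not the labelling of fig. 12.
N = 10
BONDS = [(i, (i + 1) % N) for i in range(N)] + [(4, 9)]
A = [[0] * N for _ in range(N)]
for r, s in BONDS:
    A[r][s] = A[s][r] = 1

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def char_poly(a):
    """Coefficients of det(x*I - a), highest power first (Faddeev-LeVerrier)."""
    coeffs = [Fraction(1)]
    m = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]
    for k in range(1, N + 1):
        am = matmul(a, m)
        c = -sum(am[i][i] for i in range(N)) / k
        coeffs.append(c)
        m = [[am[i][j] + (c if i == j else 0) for j in range(N)] for i in range(N)]
    return coeffs

poly = [int(c) for c in char_poly(A)]
print(poly)

def p(x):
    return sum(c * x ** (N - i) for i, c in enumerate(poly))

# The five positive roots x (energies E0 + x*beta of the occupied orbits)
roots = [(1 + math.sqrt(13)) / 2, (1 + math.sqrt(5)) / 2,
         (math.sqrt(13) - 1) / 2, 1.0, (math.sqrt(5) - 1) / 2]
assert all(abs(p(x)) < 1e-9 for x in roots)

resonance_in_beta = 2 * sum(roots) - 10   # relative to five localised double bonds
print(round(resonance_in_beta, 3))
```

Because the polynomial contains only even powers of $x$, it factorises into quadratics, which is why the roots can be written in closed form.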

The table below gives a few of the calculated and experimental values 
for the resonance energies of some of these unsaturated molecules. It 
can be seen that the agreement with experiment is reasonably good, in 
view of the nature of the approximations used. 


Resonance Energies (kcals). 

Molecule. 
Ethylene 
Butadiene 
Hexatriene 
Octatetraene 
Benzene 
Diphenyl 
Naphthalene 



We may use these calculations to study the way in which such 
unsaturated molecules are hydrogenated; for suppose that two atoms of 
Hydrogen are added on to the Naphthalene molecule. This will reduce 
the number of mobile electrons, and alter the possible types of molecular 
orbits. The energy can be calculated for each of the 18 possible molecules 
that can thus be formed, and it appears that the 1,2 addition is 
energetically the most preferred. This is, indeed, what is found experimentally. 
The course of further successive additions of Hydrogen can 
be followed in a similar manner. 

6. Bonds of Fractional Order. 

There is a further interesting development in this particular field 
of mobile electrons. As a result of recent improvements in X-ray and 
electron-scattering apparatus, and in analysis of infra-red and Raman 
spectra, it has become possible to determine the lengths of the C-C bond 
in many molecules. Certain conclusions can be drawn. 

For in cases where we have been accustomed to draw a single bond, 
and in which the Carbon atom is in the tetrahedral state which we explained 
earlier, Ethane $\mathrm{C_2H_6}$ being the typical example, the C-C link is, 
almost within the limit of measurement, constant and equal to 1.54 Å. 
In cases such as Ethylene $\mathrm{C_2H_4}$, where we have an isolated double bond, 
the link is about 1.33 Å, and in cases of a triple bond it is about 1.20 Å. 
In fact, wherever we have localised electron orbits, and consequent 
additivity of bond energies, we also find definite values for the bond 
lengths. But in molecules such as those of § 5 (Benzene, Naphthalene, 
etc.), where there are non-localised or mobile electrons, the links have 
lengths which are neither those of a pure single nor of a pure double bond, 
but are intermediate between them; it is natural to describe them as 
links of fractional order, and we may draw a curve (fig. 13) of order 
plotted against length. This curve would enable us to fix the order if 
we knew the length, or, which is of more use in practice, would enable 
us to predict the length if we could calculate the order. 

The explanation of these queer bonds of fractional order, which are 
neither single nor double, but intermediate between the two, must lie 
in the behaviour of the mobile electrons, for these electrons represent 
the only essential difference from the normal single and double 
bonds. 

Let us split up the order of a bond, as for example in Benzene, into 
two parts: the one part, which is unity, arises from the single bonds 
which we saw were first established; if we call the remainder p, then p, 
which is generally a fraction between 0 and 1, will be the contribution 
to the order from the mobile electrons; we might even call it the 
mobile order of the bond. Then p = 0 for pure single bonds, as Ethane; 
p = 1 for pure double bonds, as Ethylene; p = 2 for pure triple bonds, as 
Acetylene. 

Let us consider one of the mobile electrons; we have seen that its 
wave function is of the form 

$\Psi = c_1\psi_1 + c_2\psi_2 + \dots + c_n\psi_n,$ 

where the Carbon nuclei are numbered from 1 to n, and $\psi_1$ is the dumb-bell 
orbit round nucleus 1, etc. This electron has an orbit which extends 



over all the nuclei, and it may be thought of as spending a certain fraction 
of its time in the region between any two adjacent nuclei. If this fraction 
is large, we can say that this particular electron contributes considerably 
to the mobile order of the bond between these nuclei. We can make this 
more precise by saying that if our original wave function $\Psi$ is normalised, 
$c_r^2$ is the probability of finding the electron round nucleus r, and $c_s^2$ the 
probability of finding it around nucleus s. Unless both $c_r$ and $c_s$ are 
reasonably large, there is little chance of finding it between these two 
nuclei. Let us therefore define the contribution of this electron to the 
mobile order p between nuclei r and s as $c_r c_s$. This applies, of course, 
only to neighbour nuclei. The total mobile order of a bond is therefore 
the sum of contributions $c_r c_s$ from each of the mobile electrons present: 
we may write it 

$p = \sum c_r c_s,$ 

and the total order of the bond between r and s is $1 + \sum c_r c_s$. 

It can easily be verified that this definition of order gives a correct 
value for Ethylene (p = 1) and Acetylene (p = 2). Further, it can be shown 
on this definition that the energy of a bond of mobile order p is (1 - p) times 
a single-bond energy + p times a double-bond energy, so that p = 0 gives 
the single-bond and p = 1 the double-bond value. Our theory 
therefore provides a reasonable justification for the use of the idea of 
fractional order. 

It is not difficult now to calculate the order of any bond and thus, 
from the curve, to predict the length: for we have merely to determine 
the coefficients c r in the wave function for each mobile electron. 

When we make this calculation for Benzene we discover that all the 
links are equivalent, and p = 2/3, so that the total order is 1.67. Reading 
off from the curve gives the length as 1.39 Å, which is exactly the 
experimental value. Similar calculations for the graphite crystal, which 
is like a huge molecule of this type, give an order 1.55 and a length 
1.42 Å, in agreement with the observed value. 
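
The value p = 2/3 for Benzene follows directly from the definition $p = \sum c_r c_s$ applied to the three occupied orbits. The sketch below takes their coefficients from the Benzene table of § 5 and normalises each so that $\sum c_r^2 = 1$ (which assumes that overlap between neighbouring atomic orbits is neglected). Exact fractions are used so that the equivalence of the six links can be seen without rounding.

```python
from fractions import Fraction

# Occupied (bonding) Benzene orbits: un-normalised coefficients of
# psi_1 .. psi_6, as listed in the table of section 5.
OCCUPIED = [
    [1, 1, 1, 1, 1, 1],
    [0, 1, 1, 0, -1, -1],
    [2, 1, -1, -2, -1, 1],
]
ELECTRONS_PER_ORBIT = 2

def mobile_order(r, s):
    """p = sum over mobile electrons of c_r * c_s, with normalised c."""
    total = Fraction(0)
    for c in OCCUPIED:
        norm = sum(x * x for x in c)   # normalisation: sum of c_r^2 set to 1
        total += ELECTRONS_PER_ORBIT * Fraction(c[r] * c[s], norm)
    return total

# Mobile order of each of the six links round the ring
orders = [mobile_order(r, (r + 1) % 6) for r in range(6)]
print(orders)
print(float(1 + orders[0]))   # total order of a C-C link
```

Individual orbits contribute unequally to different links; it is only the sum over all six mobile electrons that is the same for every link.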

The conclusion that all the bonds in Benzene are equivalent is very 
interesting because it resolves the problem of the two Kekulé structures; 
the molecule must not be thought of as being in either of the states usually 
drawn, but rather in an entirely different state in which single and double 
bonds do not appear but all the bonds are exactly equivalent, approximately 
two-thirds of the way between a pure single and a pure double 
bond. 

Naphthalene (fig. 12) illustrates the same phenomenon. Using the 
notation shown in the figure, the orders and lengths of the links are 
calculated to be those in the following table:— 

Link            A       B       C       D       Mean 
Order           1.518   1.555   1.725   1.603   1.622 
Length * (Å)    1.417   1.402   1.371   1.400   1.394 

It is seen that the bonds are all very much the same, though those labelled 
C (the 1-2 bonds) are most nearly double bonds. This result is in complete 
accord with the chemical experiments which show that the 1,2 
position is very reactive. The detailed verification of the lengths in 
this molecule cannot be made, but the mean length agrees well with 
experiment, and the A link is believed, from X-ray analysis, to be somewhat 
longer than the others. 
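
The orders in the table can be recomputed from the definition $p = \sum c_r c_s$ once the coefficients $c_r$ of the five occupied orbits are known. The following is a sketch under stated assumptions: nearest-neighbour interaction with one $\beta$, no overlap, our own atom numbering (0 to 9 round the perimeter, shared link 4-9), and a plain Jacobi rotation routine standing in for whatever method was used originally. The assignment of the letters A to D to particular links is inferred from the text (C is the 1-2 link, A the link shared by the two rings) and should be treated as hypothetical.

```python
import math

N = 10
BONDS = [(i, (i + 1) % N) for i in range(N)] + [(4, 9)]

def jacobi_eigen(a, sweeps=100, tol=1e-22):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix."""
    a = [row[:] for row in a]
    n = len(a)
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        if sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):   # rotate columns p, q of a and of v
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):   # rotate rows p, q of a
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [a[i][i] for i in range(n)], v

A = [[0.0] * N for _ in range(N)]
for r, s in BONDS:
    A[r][s] = A[s][r] = 1.0

x, v = jacobi_eigen(A)
# beta is negative, so the five largest x are the occupied (bonding) orbits
occupied = sorted(range(N), key=lambda j: x[j], reverse=True)[:5]

def mobile_order(r, s):
    return 2 * sum(v[r][j] * v[s][j] for j in occupied)

# Hypothetical mapping of the link letters onto our numbering
LINKS = {"A": (4, 9), "B": (9, 0), "C": (0, 1), "D": (1, 2)}
orders = {name: 1 + mobile_order(r, s) for name, (r, s) in LINKS.items()}
for name in "ABCD":
    print(name, round(orders[name], 3))
```

The lengths in the table would then be read off from the curve of fig. 13, which the sketch does not attempt to reproduce.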

* The last decimal in this and other predicted lengths is important only for the 
purposes of comparison. 



When two Benzene rings are bonded together, as in Diphenyl (fig. 14), 
the atoms all lie in one plane. It can be shown that the links in each 
Benzene ring are hardly affected in length, but the apparent single bond 
joining the two rings has an order 1.37 and a length 1.45 Å. It is 
very far from being the straightforward single bond that it is usually 
drawn! The experimental value for this link, from X-ray analysis, is 
slightly higher, 1.48 Å, but the accuracy of this latter value does not 
exceed 0.02 Å. 

Fig. 14.—Diphenyl. 

A series of interesting molecules is shown in fig. 15. They are all 
planar molecules, and the mobile electrons are able to migrate from one 
end to the other; this explains why a substitution in one Benzene ring 
will affect conditions of further substitution at the far end of the other 
ring. The interesting link in all these molecules is the link A, which is 
usually drawn as a single bond. However, it is seen from the table below 
that its length is about half-way between a pure single bond 1.54 Å and 
a pure double bond 1.33 Å. 

[Fig. 15: Phenylethylene.] 

Molecule.          Length of Link A. 
Phenylethylene     1.44 Å 
Stilbene           1.44 Å 
Tolane             1.40 Å 

The lengths of these links have been measured by X-ray analysis, and 
agree with the calculated values to within 0.01 Å. 

Another interesting series of molecules is the conjugated chains 
$\mathrm{C}_{2n}\mathrm{H}_{2n+2}$, of which Ethylene is the first example. These molecules 
are usually written with alternating single and double bonds, starting at 
each end with a double bond. However, there is a variety of evidence, 
from optical rotation and the parachor, to show that the bonds are not 
so simple as this. If we suppose that the nuclei lie in a plane and form 
some kind of zigzag chain of which the angles are all 120°, as for 
example in octatetraene (fig. 16), then the mobile electrons 
are able to migrate over the whole molecule, and we may use our previous 
technique to describe the orders and lengths of the various bonds. The 
table below shows the lengths of the first few members of this series. It 
appears that the links do alternate in size, but that these alternations are 
much less marked away from the end, and the bonds all tend to a uniform 
length rather nearer to the double bond than to the single, as we move 
towards the centre of the molecule. None of these bonds have lengths 
corresponding to a pure single or a pure double bond. If we deal with 
the infinite conjugated chain it appears that the same process takes place, 
but that differences in length become inappreciable after about the seventh 
from each end. 



Lengths in the Conjugated Chains $\mathrm{C}_{2n}\mathrm{H}_{2n+2}$ (in Å). 

Molecule.       n.   Bond Diagram.   Link Numbers. 
Ethylene        1    =               1 
Butadiene       2    =-=             1, 2, 1 
Hexatriene      3    =-=-=           1, 2, 3, 2, 1 
Octatetraene    4    =-=-=-=         1, 2, 3, 4, 3, 2, 1 

Link Number     1       2       3       4 
Ethylene        1.331   ..      ..      .. 
Butadiene       1.345   1.432   ..      .. 
Hexatriene      1.351   1.424   1.366   .. 
Octatetraene    1.353   1.422   1.371   1.415 
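
The damping of the alternation towards the centre of the chain can be seen in the mobile orders themselves. The sketch below assumes a uniform chain with one $\beta$ for every link and no overlap, and uses the standard closed form for the orbits of such a chain, $c_r = \sqrt{2/(N+1)}\,\sin\{jr\pi/(N+1)\}$ for the $j$th orbit of an $N$-atom chain; neither the closed form nor the function name comes from the text. Only the orders are computed; converting them to lengths would require the curve of fig. 13.

```python
import math

def chain_mobile_orders(n_atoms):
    """Mobile orders p of successive links in an open chain of n_atoms Carbons.

    Assumes a uniform nearest-neighbour chain, for which orbit j has
    coefficients c_r = sqrt(2/(N+1)) * sin(j*r*pi/(N+1)), r = 1..N.
    """
    occupied = range(1, n_atoms // 2 + 1)   # lowest N/2 orbits, 2 electrons each

    def c(j, r):
        return math.sqrt(2.0 / (n_atoms + 1)) * math.sin(j * r * math.pi / (n_atoms + 1))

    return [2 * sum(c(j, r) * c(j, r + 1) for j in occupied)
            for r in range(1, n_atoms)]

CHAINS = [("Ethylene", 1), ("Butadiene", 2), ("Hexatriene", 3), ("Octatetraene", 4)]
results = {}
for name, n in CHAINS:
    orders = chain_mobile_orders(2 * n)
    results[name] = orders[:n]          # link numbers 1 .. n; the rest follow by symmetry
    print(name, [round(p, 3) for p in results[name]])
```

On this model the end link has the largest mobile order and the links next to it the smallest, which matches the pattern of short and long links in the table.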

In this section, and in the preceding section, we have discussed bonds 
of fractional order, or non-localised bonds, between pairs of Carbon 
atoms. But our analysis, and the general results, hold equally well for 
other atoms which form multiple bonds, such as Nitrogen, Oxygen, 
Sulphur, or Chlorine. 






7. Other Applications. 

There are many other applications of the methods that have just 
been described; we may refer briefly to three of them. 

(a) Rotation about a Conjugated Single Bond.—When we discussed 
the Diphenyl molecule (fig. 14) in which a so-called single bond connected 
two Benzene rings, we found that it was not in fact a true single 
link at all, but had an order 1.37. This implies that it has certain characteristics 
of a double bond, and we should therefore expect that it would 
exert an influence tending to restrict any possible rotation about it. 
The magnitude of this effect, and a similar effect in all such molecules 
where a single bond is to be found between two double bonds, or Benzene 
rings, has not been calculated accurately; but an approximate value 
may be found as follows. If we rotate the two Benzene rings till their 
planes are at 90° to each other, the mobile electrons will not be able to 
move from one-half of the molecule to the other, because the directions 
of the relevant component dumb-bells are perpendicular. The loss of 
energy in this step is found to be about 7 kcals, and this would therefore 
be approximately the height of the potential barrier (or of that 
part of it which arises from these electrons) that resists such free 
rotation. 

(b) Vibrations.—The second application is to vibration frequencies. 
It is possible to calculate how the energy of these unsaturated molecules 
depends upon the scale of the molecule; all that is needed is a knowledge 
of the manner in which the quantity $\beta$ introduced in § 5 depends upon 
the separation of two neighbouring Carbon atoms, and this may be found 
from the vibration frequencies of Ethane and Ethylene. 

We have seen in Benzene, for example, that the equilibrium configuration 
is a regular hexagon of side 1.39 Å. Suppose that the whole 
molecule expands without change of shape; we may call this “breathing”; 
then the change in energy can be calculated and from this the frequency 
of these scale-model vibrations can be determined. The calculated value 
is 1038 cm.$^{-1}$, which compares excellently with the experimental value of 
991 cm.$^{-1}$. It is not, however, yet possible to discuss in this way vibrations 
in which the angles of the hexagon change because this involves a 
distortion of the trigonal bonds in the plane of the hexagon. 

Other vibrations could be studied in which the different links varied 
in a different manner. For example, in the Allyl radical $\mathrm{CH_2}$-CH-$\mathrm{CH_2}$, 
which we discussed earlier, it is possible to determine the energy of the 
molecule as a function of $x_1$ and $x_2$, the lengths of the two C-C links. 
The equilibrium configuration, with minimum energy, corresponds to 
$x_1 = x_2 = 1.37$ Å, but it is possible from a knowledge of the energy 
contours to estimate the frequencies of those normal modes in which the 
angles at the central Carbon atom remain 120°. 

(c) Polymerisation.—The third, and last, application of the theory of 
mobile electrons is to the question of polymerisation, a matter of considerable 
importance in the field of plastics and synthetic rubbers. Here 
there is interaction between mobile electrons of one molecule and another; 
such interaction is possible if the molecules are arranged suitably relative 
to each other, and may perhaps form a temporary union by which 
other transformations take place. Thus in the example shown in 
fig. 17, between two Butadiene molecules, in (1) we show the conventional 
formulae, and in (2) the diagrammatic representation by means of mobile 
electrons; in (3) there is a fusion of the two molecules, the mobile 
electrons forming resonance across the ends; and in (4) there is a rearrangement 
of the bonds and formation of a longer chain. In this way 
the building up of huge molecules such as occurs in polymerisation may 
possibly take place. It is not necessary, however, for the interaction to 
take place across the ends of both reacting molecules, and it might, as is 
often supposed, equally well take place between other Carbon atoms of 
the two molecules, as in fig. 18, to form vinyl cyclohexene. 

Fig. 17.—One possible polymerisation of Butadiene. 

Fig. 18.—Another possible polymerisation of Butadiene. 

There are other effects which we might have outlined, such as dipole 
moments, electronic excitation, electric and magnetic susceptibilities, and 
preferential substitution; in part these are already explained, but there 
is still much work to be done, particularly from the quantitative point of 
view. Further progress will depend increasingly upon a fusion of interest 
between the theoretical and experimental worker. 




Summary. 

Wave mechanics is able to describe with some precision the motions 
of electrons in atoms, but when we study molecules we have to use more 
approximate descriptions. It turns out that what the chemist is accustomed 
to call a single bond is in reality a pair of electrons, having opposed 
spins, describing equivalent orbits which have symmetry about the line 
joining the two nuclei concerned; this may be called a localised bond. 
The tetrahedral character of the bonds from saturated Carbon atoms is 
easily fitted into this scheme. 

In Ethylene, however, another type of orbit appears; this is the 
double-streamer orbit, and two electrons in this orbit convert a normal 
single bond into a double bond. Again the bond is a localised bond, 
with a characteristic energy and length. 

In more complex molecules, such as Benzene, there is a framework 
of single bonds, and the remaining electrons have orbits that embrace 
all six of the Carbon atoms; these mobile electrons give the aromatic 
and conjugated molecules their characteristic properties, but as a result 
the bonds are neither pure single bonds nor pure double bonds, but a 
hybrid of the two, and the electrons in these bonds are no longer localised 
in the region between any two particular nuclei. The energies of these 
molecules can be calculated in fair agreement with experiment, and from 
a knowledge of the wave function it is possible to define an order, which 
is usually fractional, for these bonds. In Benzene all the C-C links 
are equivalent, and their order is $1\tfrac{2}{3}$. 

A curve which connects the fractional order with the length of the 
bond enables us to predict the lengths of these bonds, and, where 
experimental comparison is available, agreement is found. 

These mobile electrons are important in a study of vibration frequencies, 
restricted rotation about C-C bonds, and in polymerisation. 


REFERENCES TO LITERATURE. 

It is impossible in an account of this nature to give detailed references 
to all matters discussed in the text, but excellent general accounts of 
the subject, including detailed bibliographies, are to be found as follows:— 

Mulliken, R. S., 1932. “Electronic Structure of Polyatomic Molecules and 
Valence,” Phys. Rev., vol. xli, p. 49. (For the molecular orbital method.) 

Pauling, L., 1939. Nature of the Chemical Bond, Cornell Press. (For the 
electron-pair method.) 



INDEX SLIP 

Proc. R.S.E., Vol. LXI, Part II 
Section A (Mathematical and Physical Sciences) 

Coulson, C. A.—Quantum Theory of the Chemical Bond. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 115-139. 

Chemical Bond, Quantum Theory of the. 
C. A. Coulson. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 115-139. 

Quantum Theory of the Chemical Bond. 
C. A. Coulson. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 115-139. 

Edge, W. L.—Some Remarks occasioned by the Geometry of the Veronese 
Surface. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 140-159. 

Induced Matrices: used in connection with a Veronese Surface. 
W. L. Edge. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 140-159. 

Outpolar Quadrics to Veronese Surface: (1,1) Correspondence between 
Plane Quartic Curves and. 
W. L. Edge. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 140-159. 

Ternary Quartic: certain Concomitants obtained by Consideration of its 
Relation to a Quadric Outpolar to a Veronese Surface. 
W. L. Edge. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 140-159. 

Veronese Surface, Some Remarks occasioned by the Geometry of. 
W. L. Edge. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 140-159. 

Whittaker, E. T.—Some Disputed Questions in the Philosophy of the 
Physical Sciences. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 160-175. 

Philosophy of the Physical Sciences, Some Disputed Questions in the. 
E. T. Whittaker. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 160-175. 

Physical Sciences, Some Disputed Questions in the Philosophy of the. 
E. T. Whittaker. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 160-175. 

Lawley, D. N.—Further Investigations in Factor Estimation. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 176-185. 

Factors, Estimation of. 
D. N. Lawley. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 176-185. 

Aitken, A. C., and Silverstone, H.—On the Estimation of Statistical 
Parameters. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 186-194. 

Silverstone, H. See Aitken, A. C., and Silverstone, H. 

Statistical Parameters, Estimation of. 
A. C. Aitken and H. Silverstone. 
Proc. Roy. Soc. Edin., vol. lxi, A, 1941-42, pp. 186-194. 





Van Vleck, J. H., and Sherman, A., 1935. “Quantum Theory of Valence,” 
Rev. Mod. Phys., vol. vii, p. 167. (General account of both methods.) 

The following papers illustrate certain of the fundamental points in 
the quantum theory of valence. They are listed in chronological order. 

Heitler, W., and London, F., 1927. “Neutral Atoms and Homopolar 
Binding,” Zeits. Phys., vol. xliv, p. 455. 

Hund, F., 1928. “Description of Electrons in Molecules,” Zeits. Phys., vol. li, 
p. 759. 

Lennard-Jones, J. E., 1929. “Electronic Structure of some Diatomic Molecules,” 
Trans. Faraday Soc., vol. xxv, p. 688. 

Hückel, E., 1930. “Quantum Theory of the Double Bond,” Zeits. Phys., 
vol. lx, p. 423. 

Pauling, L., 1931. “The Shared-Electron Bond,” Journ. Amer. Chem. Soc., 
vol. liii, pp. 1367, 3225. 

Slater, J. C., 1931. “Molecular Energy Levels,” Phys. Rev., vol. xxxviii, 
p. 1109. 

Penney, W. G., 1934. “Structure of Benzene,” Proc. Roy. Soc. Lond., vol. 
cxlvi, A, p. 223. 

Coulson, C. A., 1938. “Bonds of Fractional Order,” Proc. Roy. Soc. Lond., 
vol. clxiv, A, p. 383. 


(Issued separately December 16, 1941.) 





XII.—Some Remarks occasioned by the Geometry of the 
Veronese Surface. By W. L. Edge, Mathematical Institute, 
University of Edinburgh. 

(MS. received April 22, 1941. Read July 7, 1941.) 

Introduction. 

The subject-matter of these pages may be briefly summarised as follows: 
the geometry of the Veronese surface, with an algebraic representation of 
it that does justice to its self-dual character; the relations of the secant 
planes of the surface to quadrics which either contain the surface or are 
outpolar to it; and the derivation of an invariant and two contravariants 
of a ternary quartic in the light of the (1, 1) correspondence between the 
quartic curves in a plane and the quadrics outpolar to a Veronese surface. 
There is no suggestion of discovering fresh properties of the surface, 
though possibly the results in § 12 and § 13 may be new; but the geometrical 
considerations lead naturally to some algebraical results which 
it seems worth while to have on record, such as, for example, the identity 
8.2 and the remarks concerning the rank of the determinant which appears 
there, and the form found in § 13 for the harmonic envelope of a plane 
quartic curve. These algebraical results lie very close to properties of the 
surface; so close in fact that one might say that the Veronese surface is the 
proper mise en scène for them. 

The introduction of the factor $\sqrt{2}$ into our equation 2.1 may seem a 
trivial matter; yet it is essential. The fountain-head of the subject is 
Veronese's paper of 1884, where the surface made its first appearance and 
as a result of which it was given its name; and the plain fact is 
that Veronese's algebra on pp. 355 and 356 is wrong, and wrong solely 
because of the omission of this normalising factor. Veronese recognised 
at first sight the symmetry and regularity of the configuration that is built 
upon the surface which geometers have named after him; he supposed, 
and rightly, that its self-dual nature would be shown too by the algebra 
if the co-ordinate system were properly chosen. And his geometrical 
insight enabled him to give the correct results without depending on any 
algebra to discover them, while he was so sure of the geometry that he must 
never have troubled to subject his algebra to any test. So he says, for 
example, that the quadric $\Sigma x^2 = 0$ touches the surface given by (5) of his 
§ 16 along a quartic curve; a statement manifestly untrue for his equation 
as it stands, but which would become true if his equation were replaced by 
our 6.1. The crux of the matter is that the surface which is given, as a 
locus of points, by (5) of Veronese’s § 16, and the surface which is given, as 
an envelope of primes, by (1) of Veronese’s § 18, are two different surfaces 
and not the same surface. 

The parametric equations of a Veronese surface regarded, on the one 
hand, as a locus of points and, on the other hand, as an envelope of primes, 
ought to be symmetrical in the point and prime parameters; this is so with 
our equations 6.1 and 6.2. But Veronese only achieved symmetry at the 
expense of accuracy; while Bertini, in his textbook, is accurate at the 
expense of symmetry. 

The classical textbook exposition of the properties of the Veronese 
surface is Bertini’s; the self-dual character of the configuration is fully 
brought out in the geometrical sections of Bertini’s chapter, but the 
algebra, though perfectly correct, does not attain the same balance. 
Perhaps it was the consciousness of this that caused Bertini to relegate 
the parametric form of the surface in prime co-ordinates, corresponding to 
our 6.2, to a footnote. In any event the impression derived from this 
otherwise excellent chapter cannot but be that the algebra is inadequate 
to do justice to the configuration. The slur thus cast upon the algebra is 
unmerited, and it is high time it was removed. 

This paper falls into two parts, the first consisting of §§ 1-10 and the 
second of §§ 11-16. 

A certain proportion of the first part is naturally concerned with setting 
up the configuration; the fact that the algebra has the desired symmetry 
seems to justify the writing of a few pages, from this aspect, about a con¬ 
figuration whose properties are already well known. When descriptive 
arguments are employed, as, for instance, in § 7 and § 10, it has been 
thought preferable to argue directly from the geometry of the configuration 
itself, and not to invoke properties of systems of conics in a plane, as has 
often been done in order to obtain properties of the configuration. In § 5 
a condition is found for a secant plane of a Veronese surface to meet a 
quadric in two lines; this condition is used in § 8 and in § 13. The dis¬ 
criminant of any quadric which contains a Veronese surface is, when the 
surface is given parametrically in canonical form, a determinant which has 
appeared in other writings; references to these have caused the writing of
§§ 9 and 10, § 9 consisting of some remarks upon a paper of Sylvester’s.

It is in § 11 that the quadric which is associated with a plane quartic
curve and is outpolar to the Veronese surface is introduced, and the
quartic curve and this associated quadric are the core of §§ 11-16. The 
results obtained for the quartic curve could be extended to primals of any 
even order in space of any number of dimensions, and the work has been 
written out with an eye to such extensions. But the paper is sufficiently 
long, so that matters have been left, for the time being, as they stand.

Let it be emphasised that any algebraic result which appears here does 
so in consequence of studying the geometry of the Veronese surface. 

I. 

1. A point of a plane $\sigma$ may be identified by the ratios of three
homogeneous co-ordinates $x_1, x_2, x_3$ and a line by the ratios of three
homogeneous co-ordinates $u_1, u_2, u_3$. The equation

$$u_1x_1 + u_2x_2 + u_3x_3 = 0 \tag{1.1}$$

may then be interpreted in either of two ways; it may imply that the point
whose co-ordinates $x_1, x_2, x_3$ vary subject to 1.1 lies on the fixed line
$(u_1, u_2, u_3)$, or else that the line whose co-ordinates $u_1, u_2, u_3$ vary
subject to 1.1 passes through the fixed point $\{x_1, x_2, x_3\}$. It is usual to
denote a line by the row-vector $u$ formed by its co-ordinates written in their
proper order,

$$u = (u_1, u_2, u_3),$$

while a point is denoted by the column-vector formed by its co-ordinates
written in their proper order,

$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = (x_1, x_2, x_3)' = \{x_1, x_2, x_3\},$$

the dash indicating transposition of row into column, an operation sometimes
indicated, in order to save vertical space, by writing the co-ordinates
in a horizontal row but enclosing them in the brackets { }. The ordinary
row into column rule for multiplication of matrices then gives

$$ux = u_1x_1 + u_2x_2 + u_3x_3.$$

2. Now take the squares and products of the three co-ordinates $x_i$ and
write, $\tau$ being a square root of 2,

$$x^{[2]} = \{x_1^2,\ x_2^2,\ x_3^2,\ \tau x_2x_3,\ \tau x_3x_1,\ \tau x_1x_2\}. \tag{2.1}$$

As the point $x$ varies in $\sigma$ the point $x^{[2]}$, having six homogeneous
co-ordinates, varies in a five-dimensional space $\Sigma$; it is, however,
constrained to lie on a certain surface whose points are in (1, 1)
correspondence with those of $\sigma$, a point of the surface and a point of
$\sigma$ corresponding to one another when they have the same values for the
mutual ratios of the $x_i$.

More generally we may take the surface $F$ in $\Sigma$ such that the
co-ordinates of a point of $F$ are the six constituents $y_1, y_2, y_3, y_4, y_5, y_6$
of the column-vector

$$y = Mx^{[2]}, \tag{2.2}$$

where $M$ is any non-singular matrix of six rows and columns; it is presumed
to be non-singular because otherwise there would be a linear
relation $vy = 0$ connecting the constituents of $y$, and the surface would then
lie in a prime, or four-dimensional space, of $\Sigma$.

Any prime $vy = 0$ meets $F$ in a curve whose points correspond to those
of the conic $vMx^{[2]} = 0$; since two conics in $\sigma$ have four intersections
there are four points common to $F$ and two primes. Thus $F$ is a quartic surface,
having its prime sections represented by the conics of $\sigma$; it is the
well-known Veronese surface, whose properties are classical (Veronese, 1884;
Segre, 1885; Bertini, 1923).

3. The lines $ux = 0$ of $\sigma$ correspond to a set of curves on $F$ which, by
analogy with the lines of $\sigma$, have the properties that any two of the curves
have one common point, while through any two points of $F$ there passes
one and only one of the curves. Since a line $ux = 0$ and a conic $vMx^{[2]} = 0$
in $\sigma$ have two intersections, the curves on $F$ are met by the prime sections
in two points, and so are conics. The doubly infinite set of planes in which
these conics lie will be called the secant planes of $F$.

If $ux = 0$ and $y = Mx^{[2]}$, then

$$\begin{aligned}
(u_1\ \ \cdot\ \ \cdot\ \ \cdot\ \ \tau^{-1}u_3\ \ \tau^{-1}u_2)\,M^{-1}y &= 0,\\
(\cdot\ \ u_2\ \ \cdot\ \ \tau^{-1}u_3\ \ \cdot\ \ \tau^{-1}u_1)\,M^{-1}y &= 0,\\
(\cdot\ \ \cdot\ \ u_3\ \ \tau^{-1}u_2\ \ \tau^{-1}u_1\ \ \cdot)\,M^{-1}y &= 0.
\end{aligned} \tag{3.1}$$

All points of that conic on $F$ which corresponds to points of $\sigma$ for which
$ux = 0$ satisfy these three linear equations, which are therefore the equations
of primes containing the secant plane and so, being linearly independent,
determine this secant plane.
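
The equations 3.1 lend themselves to a quick numerical check. The Python sketch below is only an illustration: it assumes that M is the unit matrix (the normalisation adopted later, in § 6), it works in floating point, and the helper names `veronese` and `secant_plane_rows` are hypothetical. It confirms that the three rows of 3.1 annihilate $x^{[2]}$ when $ux = 0$, and that $u^{[2]}x^{[2]}$ equals $(ux)^2$ for an arbitrary pair.

```python
import math

TAU = math.sqrt(2.0)  # the paper's tau, a square root of 2


def veronese(x):
    """Return x^[2] = {x1^2, x2^2, x3^2, tau*x2*x3, tau*x3*x1, tau*x1*x2}."""
    x1, x2, x3 = x
    return [x1 * x1, x2 * x2, x3 * x3, TAU * x2 * x3, TAU * x3 * x1, TAU * x1 * x2]


def secant_plane_rows(u):
    """Coefficient rows of the three primes 3.1, assuming M = I."""
    u1, u2, u3 = u
    t = 1.0 / TAU
    return [
        [u1, 0.0, 0.0, 0.0, t * u3, t * u2],
        [0.0, u2, 0.0, t * u3, 0.0, t * u1],
        [0.0, 0.0, u3, t * u2, t * u1, 0.0],
    ]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


u = [2.0, -3.0, 5.0]
x = [3.0, 2.0, 0.0]  # lies on the line u, since 2*3 - 3*2 + 5*0 = 0
residuals = [dot(row, veronese(x)) for row in secant_plane_rows(u)]

# For a point z off the line, u^[2] . z^[2] should equal (uz)^2.
z = [1.0, 4.0, -2.0]
square_gap = dot(veronese(u), veronese(z)) - dot(u, z) ** 2
```

Each residual is $x_i$ times $ux$, which is why all three vanish together on the line.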

A prime $vy = 0$ meets $F$, in general, in a rational quartic curve
corresponding to a conic of $\sigma$. But if the conic of $\sigma$ is the line
$ux = 0$ taken twice, $v$ must be such that the result of substituting the value
of $y$ given by 2.2 in $vy$ is a constant multiple of the square of $ux$. We may
suppose, absorbing this constant multiple into $v$, that $vy$ is equal to the
square of $ux$, and this is so when, and only when,

$$v = (u_1^2,\ u_2^2,\ u_3^2,\ \tau u_2u_3,\ \tau u_3u_1,\ \tau u_1u_2)\,M^{-1} = u^{[2]}M^{-1}. \tag{3.2}$$

These primes will be called the tangent primes of $F$; each of them meets $F$
in one of its conics taken twice, and 3.2 is a parametric representation of
them just as 2.2 was of the points of $F$. It is these primes that are the
“spazi tangenti doppi” of Veronese and Segre.

We have considered points of $F$ for which $x$ satisfies 1.1 with fixed $u$;
it is natural to consider also tangent primes of $F$ for which $u$ satisfies 1.1
with fixed $x$. If $ux = 0$ and $v = u^{[2]}M^{-1}$, then

$$\begin{aligned}
vM\{x_1\ \ \cdot\ \ \cdot\ \ \cdot\ \ \tau^{-1}x_3\ \ \tau^{-1}x_2\} &= 0,\\
vM\{\cdot\ \ x_2\ \ \cdot\ \ \tau^{-1}x_3\ \ \cdot\ \ \tau^{-1}x_1\} &= 0,\\
vM\{\cdot\ \ \cdot\ \ x_3\ \ \tau^{-1}x_2\ \ \tau^{-1}x_1\ \ \cdot\} &= 0,
\end{aligned} \tag{3.3}$$

three equations having an interpretation dual to that of the equations 3.1;
they are linearly independent and so determine a plane, this plane being
common to the primes which touch $F$ along the respective conics which
pass through the fixed point $x$ of $F$. Now this plane is the one which
contains the three points given by

$$\begin{aligned}
y &= M\{x_1\ \ \cdot\ \ \cdot\ \ \cdot\ \ \tau^{-1}x_3\ \ \tau^{-1}x_2\},\\
y &= M\{\cdot\ \ x_2\ \ \cdot\ \ \tau^{-1}x_3\ \ \cdot\ \ \tau^{-1}x_1\},\\
y &= M\{\cdot\ \ \cdot\ \ x_3\ \ \tau^{-1}x_2\ \ \tau^{-1}x_1\ \ \cdot\},
\end{aligned} \tag{3.4}$$

and these, being obtained by the polarisations of 2.2, all lie in the tangent
plane of $F$ at $x$ (cf. Edge, 1938, p. 473), which is therefore the plane
determined by 3.3.

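4. The tangent plane at a point of $F$ is determined by the three points
3.4, while that at a second point of $F$ is determined by the three points

$$\begin{aligned}
y &= M\{\xi_1\ \ \cdot\ \ \cdot\ \ \cdot\ \ \tau^{-1}\xi_3\ \ \tau^{-1}\xi_2\},\\
y &= M\{\cdot\ \ \xi_2\ \ \cdot\ \ \tau^{-1}\xi_3\ \ \cdot\ \ \tau^{-1}\xi_1\},\\
y &= M\{\cdot\ \ \cdot\ \ \xi_3\ \ \tau^{-1}\xi_2\ \ \tau^{-1}\xi_1\ \ \cdot\}.
\end{aligned} \tag{4.1}$$

The point

$$y = M\{x_1\xi_1,\ x_2\xi_2,\ x_3\xi_3,\ \tau^{-1}(x_2\xi_3 + x_3\xi_2),\ \tau^{-1}(x_3\xi_1 + x_1\xi_3),\ \tau^{-1}(x_1\xi_2 + x_2\xi_1)\} \tag{4.2}$$

is manifestly linearly dependent both on the points 3.4 and on the points
4.1; it therefore belongs to both the tangent planes. Whence, although
two planes do not meet in general, any pair of tangent planes of $F$
has a point of intersection. Now the determinant

$$\begin{vmatrix}
\tau x_1\xi_1 & \tau^{-1}(x_1\xi_2 + x_2\xi_1) & \tau^{-1}(x_3\xi_1 + x_1\xi_3) \\
\tau^{-1}(x_1\xi_2 + x_2\xi_1) & \tau x_2\xi_2 & \tau^{-1}(x_2\xi_3 + x_3\xi_2) \\
\tau^{-1}(x_3\xi_1 + x_1\xi_3) & \tau^{-1}(x_2\xi_3 + x_3\xi_2) & \tau x_3\xi_3
\end{vmatrix}, \tag{4.3}$$

being a numerical multiple of the discriminant of

$$(u_1x_1 + u_2x_2 + u_3x_3)(u_1\xi_1 + u_2\xi_2 + u_3\xi_3), \tag{4.4}$$

must vanish identically; hence the equation of the locus of the intersection
of pairs of tangent planes of $F$ is

$$\begin{vmatrix}
\tau\eta_1 & \eta_6 & \eta_5 \\
\eta_6 & \tau\eta_2 & \eta_4 \\
\eta_5 & \eta_4 & \tau\eta_3
\end{vmatrix} = 0, \tag{4.5}$$

where the $\eta_i$ are the six constituents of the column vector $M^{-1}y$. The
locus is therefore a cubic primal $C$.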

If the point $\xi$ were to coincide with the point $x$, then 4.2 would
be the point $Mx^{[2]}$ and so a point of $F$. But then 4.4 would be a
perfect square, so that every first minor of 4.3 would vanish. Hence $F$
is a double surface on $C$. Every chord of $F$ must therefore lie on $C$ as
meeting it in at least four points, from which it follows that all the secant
planes of $F$ lie on $C$ and also, since the chords of $F$ include its tangents,
that all the tangent planes of $F$ lie on $C$.

It is the secant planes of $F$ that afford the projective generation of $C$
indicated by the determinant 4.5; this determinant is in fact obtainable
at once by eliminating $u_1, u_2, u_3$ from 3.1.
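
The two facts just used (that the point 4.2 lies on C, and that all first minors vanish at a point of F) can be checked numerically. The sketch below is a hedged illustration only: it assumes M is the unit matrix, uses floating point for $\tau$, and the function names are hypothetical.

```python
import math

TAU = math.sqrt(2.0)


def h_matrix(y):
    """Symmetric 3x3 matrix whose determinant is the cubic form of 4.5 (M = I)."""
    y1, y2, y3, y4, y5, y6 = y
    return [[TAU * y1, y6, y5], [y6, TAU * y2, y4], [y5, y4, TAU * y3]]


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def meet_point(x, xi):
    """The point 4.2 (with M = I) shared by the tangent planes at x^[2] and xi^[2]."""
    t = 1.0 / TAU
    return [
        x[0] * xi[0], x[1] * xi[1], x[2] * xi[2],
        t * (x[1] * xi[2] + x[2] * xi[1]),
        t * (x[2] * xi[0] + x[0] * xi[2]),
        t * (x[0] * xi[1] + x[1] * xi[0]),
    ]


def first_minors(m):
    """All nine 2x2 minors of a 3x3 matrix."""
    out = []
    for r in range(3):
        for c in range(3):
            rows = [i for i in range(3) if i != r]
            cols = [j for j in range(3) if j != c]
            out.append(m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                       - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
    return out


x = [1.0, 2.0, -1.0]
xi = [3.0, -1.0, 2.0]
cubic_at_meet = det3(h_matrix(meet_point(x, xi)))       # a point of C
minors_on_F = first_minors(h_matrix(meet_point(x, x)))  # xi = x gives a point of F
```

With $\xi = x$ the matrix is a multiple of $xx'$ and so has rank 1, which is the algebraic content of the statement that F is a double surface on C.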

5. Let $Y$ be any quadric of $\Sigma$; its equation is

$$y'Ay = 0,$$

where $A$ is a symmetric matrix of six rows and columns. Every plane of
$\Sigma$ meets $Y$ in a conic, and for certain planes of $\Sigma$ this conic is a
pair of lines. The condition to which the plane must be subjected in order that
the conic should thus degenerate is obtained (cf. Bertini, 1923, p. 149), in the
standard fashion, by bordering $A$; below by three rows arising from the
equations of three linearly independent primes through the plane, to the
right by the three columns obtained by transposition of these rows. In
particular it follows, on referring to 3.1, that a secant plane of $F$ meets $Y$
in a pair of lines provided that the determinant of the matrix

$$\begin{pmatrix} A & M'^{-1}U' \\ UM^{-1} & \cdot \end{pmatrix}$$

vanishes, where

$$U = \begin{pmatrix}
u_1 & \cdot & \cdot & \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 \\
\cdot & u_2 & \cdot & \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 \\
\cdot & \cdot & u_3 & \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot
\end{pmatrix}.$$

Since $M$ is non-singular this matrix is equal to

$$\begin{pmatrix} M'^{-1} & \cdot \\ \cdot & I \end{pmatrix}
\begin{pmatrix} M'AM & U' \\ U & \cdot \end{pmatrix}
\begin{pmatrix} M^{-1} & \cdot \\ \cdot & I \end{pmatrix},$$

where $I$ is the unit matrix of three rows and columns, so that the condition
is

$$\begin{vmatrix} M'AM & U' \\ U & \cdot \end{vmatrix} = 0.$$

Those secant planes which meet $Y$ in pairs of lines thus correspond to
lines belonging to an envelope in $\sigma$ of the sixth class.

6. Now let us arrange, as we always may, that $M$, through the agency
of a collineation, is replaced by the unit matrix; then the points of $F$ are
given by

$$y = x^{[2]} \tag{6.1}$$

and the tangent primes by

$$v = u^{[2]}. \tag{6.2}$$

The point of intersection of two tangent planes of $F$ is now

$$y = \{x_1\xi_1,\ x_2\xi_2,\ x_3\xi_3,\ \tau^{-1}(x_2\xi_3 + x_3\xi_2),\ \tau^{-1}(x_3\xi_1 + x_1\xi_3),\ \tau^{-1}(x_1\xi_2 + x_2\xi_1)\}, \tag{6.3}$$

while the equation of $C$ is

$$\begin{vmatrix}
\tau y_1 & y_6 & y_5 \\
y_6 & \tau y_2 & y_4 \\
y_5 & y_4 & \tau y_3
\end{vmatrix} = 0. \tag{6.4}$$

The first polars of the points of $\Sigma$ with respect to $C$ form a linear system
of quadric primals of freedom 5; since $F$ is a double surface on $C$ all these
quadrics contain $F$. Conversely: every quadric which contains $F$ is the
polar quadric of some point of $\Sigma$ with respect to $C$. For any quadric
whose equation is satisfied identically by the co-ordinates 6.1 must be a
linear combination of the six quadrics

$$y_4^2 = 2y_2y_3, \qquad y_5^2 = 2y_3y_1, \qquad y_6^2 = 2y_1y_2,$$

$$y_5y_6 = \tau y_1y_4, \qquad y_6y_4 = \tau y_2y_5, \qquad y_4y_5 = \tau y_3y_6,$$

while the polar quadric of the point $y = \eta$ with respect to $C$ is

$$\begin{aligned}
&\eta_1(2y_2y_3 - y_4^2) + \eta_2(2y_3y_1 - y_5^2) + \eta_3(2y_1y_2 - y_6^2) + \eta_4(\tau y_5y_6 - 2y_1y_4) \\
&\qquad + \eta_5(\tau y_6y_4 - 2y_2y_5) + \eta_6(\tau y_4y_5 - 2y_3y_6) = 0.
\end{aligned} \tag{6.5}$$
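
A short numerical sketch may help to confirm 6.5. It assumes the cubic form of 6.4 expanded as $\tau^3y_1y_2y_3 + 2y_4y_5y_6 - \tau(y_1y_4^2 + y_2y_5^2 + y_3y_6^2)$, uses floating point, and the function names are illustrative rather than the paper's. It checks that the quadric 6.5 contains F, and that its left-hand side is the derivative of the cubic in the direction $\eta$ divided by $\tau$.

```python
import math

TAU = math.sqrt(2.0)


def veronese(x):
    x1, x2, x3 = x
    return [x1 * x1, x2 * x2, x3 * x3, TAU * x2 * x3, TAU * x3 * x1, TAU * x1 * x2]


def cubic(y):
    """Left-hand side of 6.4, expanded."""
    y1, y2, y3, y4, y5, y6 = y
    return (TAU ** 3 * y1 * y2 * y3 + 2.0 * y4 * y5 * y6
            - TAU * (y1 * y4 ** 2 + y2 * y5 ** 2 + y3 * y6 ** 2))


def polar_quadric(eta, y):
    """Left-hand side of 6.5."""
    e1, e2, e3, e4, e5, e6 = eta
    y1, y2, y3, y4, y5, y6 = y
    return (e1 * (2 * y2 * y3 - y4 ** 2) + e2 * (2 * y3 * y1 - y5 ** 2)
            + e3 * (2 * y1 * y2 - y6 ** 2) + e4 * (TAU * y5 * y6 - 2 * y1 * y4)
            + e5 * (TAU * y6 * y4 - 2 * y2 * y5) + e6 * (TAU * y4 * y5 - 2 * y3 * y6))


eta = [1.0, -2.0, 3.0, 0.5, -1.5, 2.0]
on_F = polar_quadric(eta, veronese([2.0, -1.0, 3.0]))

# For any cubic form, C(y + eta) - C(y - eta) = 2*D + 2*C(eta), where D is the
# derivative of C at y in the direction eta; this gives D without differencing error.
y = [1.0, 2.0, -1.0, 3.0, 0.5, -2.0]
plus = [a + b for a, b in zip(y, eta)]
minus = [a - b for a, b in zip(y, eta)]
derivative = (cubic(plus) - cubic(minus)) / 2.0 - cubic(eta)
polar_gap = derivative - TAU * polar_quadric(eta, y)
```

The factor $\tau$ in the last line is the common numerical factor dropped when the polar is written as 6.5.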

7. A quadric which contains $F$, and so is a polar quadric of $C$, will
be denoted by the symbol $Q$. Take any point of $C$ which lies on $Q$ and
not on $F$; there is a secant plane of $F$ passing through this point, and so
meeting $Q$ in the point as well as in a conic of $F$; the plane therefore lies
entirely on $Q$. Hence the intersection of $Q$ and $C$ consists wholly of
secant planes of $F$, and the sextic threefold $K_3^6$ common to $Q$ and $C$ is
generated by a singly infinite family of secant planes.

The conics in which $F$ is met by these planes of $K_3^6$ have an envelope
which, we prove, is the section of $F$ by the prime which is the second
polar, with respect to $C$, of that point of which $Q$ is the first polar.

Let $P$ be any point of $F$; the tangent cone of $C$ at $P$ is * the quadric
cone $\Gamma$ by which $F$ is projected from $T$, its tangent plane at $P$, and the
tangent primes of $\Gamma$ are those tangent primes of $F$ which pass through $T$.
The polar quadric $Q$ of any point $A$ passes through $P$, and its tangent
prime at $P$ is † the polar prime of $A$ with respect to $\Gamma$; this prime passes
through $T$ and so meets $F$ in a pair of conics, these conics being those two
which pass through $P$ and have their planes on $K_3^6$. $P$ is on the envelope
of the conics when those two conics which pass through $P$ and have their
planes on $K_3^6$ coincide; in order that this should happen the tangent prime
of $Q$ at $P$ must also be a tangent prime of $F$, and so of $\Gamma$. Wherefore the
polar prime of $A$ with respect to $\Gamma$ must be a tangent prime of $\Gamma$, and $A$
must lie on $\Gamma$. But $\Gamma$, the tangent cone of $C$ at its node $P$, is the first polar
of $P$ with respect to $C$, so that the first polar of $P$ must pass through $A$.
Hence the second polar of $A$ must pass through $P$, which is what we
desired to prove.

Those secant planes of $F$ which lie on $Q$ are thus identified. If the
equation of $Q$ is 6.5, then the secant planes which lie on $Q$ are those meeting
$F$ in conics which touch the section of $F$ by the polar prime of $y = \eta$ with
respect to $Q$. The equation of this polar prime is

$$\begin{aligned}
&y_1(2\eta_2\eta_3 - \eta_4^2) + y_2(2\eta_3\eta_1 - \eta_5^2) + y_3(2\eta_1\eta_2 - \eta_6^2) \\
&\qquad + y_4(\tau\eta_5\eta_6 - 2\eta_1\eta_4) + y_5(\tau\eta_6\eta_4 - 2\eta_2\eta_5) + y_6(\tau\eta_4\eta_5 - 2\eta_3\eta_6) = 0.
\end{aligned} \tag{7.1}$$

The section of $F$ by this prime clearly corresponds to the conic in $\sigma$ whose
equation is

$$\begin{vmatrix}
\tau\eta_1 & \eta_6 & \eta_5 & x_1 \\
\eta_6 & \tau\eta_2 & \eta_4 & x_2 \\
\eta_5 & \eta_4 & \tau\eta_3 & x_3 \\
x_1 & x_2 & x_3 & 0
\end{vmatrix} = 0 \tag{7.2}$$

and whose line-equation is therefore

$$\eta_1u_1^2 + \eta_2u_2^2 + \eta_3u_3^2 + \tau\eta_4u_2u_3 + \tau\eta_5u_3u_1 + \tau\eta_6u_1u_2 = 0; \tag{7.3}$$

the lines belonging to this envelope in $\sigma$ correspond to those conics on $F$
whose planes lie on $Q$.

* The cone must have the tangent plane of the double surface of C for its vertex, and
must contain every line joining P to another point of the double surface.

† The tangent lines of the first polar of a point P with respect to a primal, at any
point which is common to the polar and the primal, generate the locus which is the first
polar of P with respect to the locus generated by the tangent lines of the primal at the
point.






it is the most general type of matrix for which $x^{[2]\prime}Hx^{[2]}$ is identically zero.

If $H$ is now bordered, in the manner of § 5, by the rows $U$ and the
columns $U'$, the resulting determinant, when equated to zero, gives the
condition for a secant plane of $F$ to meet $Q$ in a pair of lines. But every
secant plane of $F$ meets $Q$ in a non-degenerate conic, namely, the conic
in which the plane meets $F$; thus if the secant plane is such that the
determinant vanishes it must be one of those which lie entirely on $Q$. But
it has just been seen that those secant planes of $F$ which lie on $Q$ are those
which satisfy 7.3, and no others; the inescapable conclusion is that, for a
quadric $Q$, the sextic envelope mentioned at the end of § 5 must be the cube
of the conic 7.3. The determinant must therefore be a numerical multiple
of the cube of the left-hand side of 7.3; what precise multiple it is is seen
at once by comparing the coefficients of $\eta_1^3u_1^6$.

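Thus we have established the identity

$$\begin{vmatrix}
\cdot & \eta_3 & \eta_2 & -\eta_4 & \cdot & \cdot & u_1 & \cdot & \cdot \\
\eta_3 & \cdot & \eta_1 & \cdot & -\eta_5 & \cdot & \cdot & u_2 & \cdot \\
\eta_2 & \eta_1 & \cdot & \cdot & \cdot & -\eta_6 & \cdot & \cdot & u_3 \\
-\eta_4 & \cdot & \cdot & -\eta_1 & \tau^{-1}\eta_6 & \tau^{-1}\eta_5 & \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 \\
\cdot & -\eta_5 & \cdot & \tau^{-1}\eta_6 & -\eta_2 & \tau^{-1}\eta_4 & \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 \\
\cdot & \cdot & -\eta_6 & \tau^{-1}\eta_5 & \tau^{-1}\eta_4 & -\eta_3 & \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot \\
u_1 & \cdot & \cdot & \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 & \cdot & \cdot & \cdot \\
\cdot & u_2 & \cdot & \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 & \cdot & \cdot & \cdot \\
\cdot & \cdot & u_3 & \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot & \cdot & \cdot & \cdot
\end{vmatrix}
= -\tfrac{1}{4}\,(\eta_1u_1^2 + \eta_2u_2^2 + \eta_3u_3^2 + \tau\eta_4u_2u_3 + \tau\eta_5u_3u_1 + \tau\eta_6u_1u_2)^3. \tag{8.2}$$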
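
The identity 8.2 can be tested numerically. In the sketch below the matrix `h_big` is read off from the leading six-rowed block of the determinant in 8.2, and the multiple $-\tfrac{1}{4}$ is the one obtained by comparing the coefficients of $\eta_1^3u_1^6$; both are stated here as assumptions of the sketch, and all function names are hypothetical. Floating-point arithmetic is used throughout.

```python
import math

TAU = math.sqrt(2.0)
T = 1.0 / TAU


def h_big(eta):
    """Six-rowed matrix H, as read from the leading block of 8.2."""
    e1, e2, e3, e4, e5, e6 = eta
    return [
        [0.0, e3, e2, -e4, 0.0, 0.0],
        [e3, 0.0, e1, 0.0, -e5, 0.0],
        [e2, e1, 0.0, 0.0, 0.0, -e6],
        [-e4, 0.0, 0.0, -e1, T * e6, T * e5],
        [0.0, -e5, 0.0, T * e6, -e2, T * e4],
        [0.0, 0.0, -e6, T * e5, T * e4, -e3],
    ]


def u_rows(u):
    """The three rows of U from section 5."""
    u1, u2, u3 = u
    return [
        [u1, 0.0, 0.0, 0.0, T * u3, T * u2],
        [0.0, u2, 0.0, T * u3, 0.0, T * u1],
        [0.0, 0.0, u3, T * u2, T * u1, 0.0],
    ]


def bordered(a, rows):
    """Border a below by `rows` and to the right by their transposes."""
    n, k = len(a), len(rows)
    top = [a[i] + [rows[j][i] for j in range(k)] for i in range(n)]
    bottom = [rows[j] + [0.0] * k for j in range(k)]
    return top + bottom


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    value = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            value = -value
        value *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return value


eta = [1.0, -2.0, 3.0, 0.5, -1.5, 2.0]
u = [2.0, 1.0, -1.0]
lhs = det(bordered(h_big(eta), u_rows(u)))
conic = (eta[0] * u[0] ** 2 + eta[1] * u[1] ** 2 + eta[2] * u[2] ** 2
         + TAU * (eta[3] * u[1] * u[2] + eta[4] * u[2] * u[0] + eta[5] * u[0] * u[1]))
rhs = -0.25 * conic ** 3
```

A single sample point does not prove the identity, but a wrong sign or multiple would show up at once as a mismatch between `lhs` and `rhs`.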

There is, however, more to be said about the determinant on the left of
8.2 than the mere identity alone tells us; for the secant plane of $F$ not
merely meets $Q$ in a pair of lines, but lies entirely on $Q$. A reference to
any of the textbooks, where the condition for a linear space to touch a
quadric is obtained as the vanishing of a bordered determinant, will show
that the argument employed can be extended to prove that the necessary
and sufficient conditions for the space to meet the quadric in a cone with
an $[a]$ for vertex are that the rank of the determinant should fall below its
full rank by $a + 1$; ordinary contact, such as a plane meeting a quadric
in two lines, corresponds to the case of $a = 0$. Hence, for a secant plane of
$F$ which lies on $Q$, the rank of the determinant in 8.2 falls to 6, and the
expression which appears cubed on the right is a factor of every seven-rowed
minor.

9. The determinant $|H|$ made its first appearance precisely a century
ago (Sylvester, 1841, p. 232), in a slightly different form without the
factors $\tau$. It is not written down by Sylvester as a determinant, but the
reader is left to observe its structure from the coefficients in six equations.
Later, in 1856, Cayley encountered it again and observed that it was a
numerical multiple of the square of the determinant

$$|h| = \begin{vmatrix}
\tau\eta_1 & \eta_6 & \eta_5 \\
\eta_6 & \tau\eta_2 & \eta_4 \\
\eta_5 & \eta_4 & \tau\eta_3
\end{vmatrix}.$$

Cayley says that it would be desirable to have an a priori demonstration
that $|h|$ occurs as a squared factor in $|H|$; the natural setting for such
a demonstration is the geometry of the Veronese surface, but before
speaking of this it will be fitting to pass one or two remarks on Sylvester's
paper, which is historic for the reason that it is there shown for the first
time how the resultant of three conics may be written down.

Sylvester, before giving the process for obtaining the resultant of any
three conics, gives three particular examples; it is in the first of these that
$|H|$ occurs. In the present notation, Sylvester’s proposal is to obtain
the resultant of the three equations

$$\left.\begin{aligned}
\eta_3x_2^2 + \eta_2x_3^2 - \tau\eta_4x_2x_3 &= 0,\\
\eta_1x_3^2 + \eta_3x_1^2 - \tau\eta_5x_3x_1 &= 0,\\
\eta_2x_1^2 + \eta_1x_2^2 - \tau\eta_6x_1x_2 &= 0.
\end{aligned}\right\} \tag{9.1}$$

The coefficients of the squares of the variables were chosen in this way
so that the dialytic process of elimination might the more easily be applied.

If the three equations are multiplied in order by $x_1^2$, $-x_2^2$, $-x_3^2$ and
added, the result being then divided by $\tau x_2x_3$, and if the corresponding
processes derived by cyclically permuting the suffixes are also carried out,
the three equations

$$\left.\begin{aligned}
-\eta_4x_1^2 - \tau\eta_1x_2x_3 + \eta_6x_3x_1 + \eta_5x_1x_2 &= 0,\\
-\eta_5x_2^2 - \tau\eta_2x_3x_1 + \eta_4x_1x_2 + \eta_6x_2x_3 &= 0,\\
-\eta_6x_3^2 - \tau\eta_3x_1x_2 + \eta_5x_2x_3 + \eta_4x_3x_1 &= 0,
\end{aligned}\right\} \tag{9.2}$$

are obtained.
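
The passage from 9.1 to 9.2 is a one-line computation, and the following sketch checks the first of the three derived equations at a sample point. It is an illustration only, using floating point and hypothetical function names; the sample values of $\eta$ and $x$ are arbitrary.

```python
import math

TAU = math.sqrt(2.0)


def net(eta, x):
    """Left-hand sides of the three equations 9.1."""
    e1, e2, e3, e4, e5, e6 = eta
    x1, x2, x3 = x
    return [
        e3 * x2 ** 2 + e2 * x3 ** 2 - TAU * e4 * x2 * x3,
        e1 * x3 ** 2 + e3 * x1 ** 2 - TAU * e5 * x3 * x1,
        e2 * x1 ** 2 + e1 * x2 ** 2 - TAU * e6 * x1 * x2,
    ]


def first_derived(eta, x):
    """Left-hand side of the first equation of 9.2."""
    e1, e2, e3, e4, e5, e6 = eta
    x1, x2, x3 = x
    return -e4 * x1 ** 2 - TAU * e1 * x2 * x3 + e6 * x3 * x1 + e5 * x1 * x2


eta = [1.0, -2.0, 3.0, 0.5, -1.5, 2.0]
x = [2.0, -1.0, 3.0]
e = net(eta, x)

# Multiply by x1^2, -x2^2, -x3^2, add, then divide by tau*x2*x3.
combined = (x[0] ** 2 * e[0] - x[1] ** 2 * e[1] - x[2] ** 2 * e[2]) / (TAU * x[1] * x[2])
gap = combined - first_derived(eta, x)
```

The division by $\tau x_2x_3$ is exactly where a base point on a side of the triangle of reference can be lost.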



The conics represented by these equations do not belong to the net 9.1;
but, from the way in which they have been derived, they all pass through
any base point the net 9.1 may have, unless such a base point lies on a side
of the triangle of reference, a proviso which Sylvester, in his enthusiasm
for the application of his new process of dialysis, overlooked. Thus, apart
from the proviso, the necessary condition for the net 9.1 to have a base
point is obtained by the elimination of

$$x_1^2,\ x_2^2,\ x_3^2,\ \tau x_2x_3,\ \tau x_3x_1,\ \tau x_1x_2$$

between the six equations 9.1 and 9.2; the result of this elimination is
precisely

$$|H| = 0.$$

Sylvester then states incorrectly the value of $|H|$; Muir says that this
blunder is unaccountable, but the reason for it is the neglect of the proviso
just mentioned. A glance at 9.1 shows that, if $\eta_1$ were to vanish, $x_1$ would
be a factor of the left-hand sides of the second and third of the three
equations, which would then certainly have a common solution. Thus $\eta_1$
must occur as a factor in the resultant, and so, by parity of reasoning, must
$\eta_2$ and $\eta_3$. Knowing that the resultant must include the factor $\eta_1\eta_2\eta_3$,
Sylvester jumped to the conclusion that this factor must also occur in
$|H|$; presumably he believed that the value of $|H|$ was $\eta_1\eta_2\eta_3|h|$,
and his wrong form for $|h|$ is a slip. He was, of course, perfectly
correct in assuming that $\eta_1\eta_2\eta_3$ occurs as a factor of the resultant; but
$|H|$ is only a part of this resultant. The resultant of three conics is of
degree four in the coefficients of each of them, as indeed appears at the
end of this very paper of Sylvester's; thus the resultant of the equations
9.1 is of degree 12 in the $\eta$'s, whereas $|H|$ is only of degree 6. The
resultant is actually $\eta_1^2\eta_2^2\eta_3^2|h|^2$, while $|H| = -\tfrac{1}{4}|h|^2$.

The fact that their resultant is a perfect square must be a consequence of
some peculiarity of the equations 9.1, and it is interesting that, in selecting
the coefficients in 9.1 so that further quadratic forms 9.2 could be found to
combine with them, Sylvester hit on a set of equations whose peculiarity is
best explained in geometrical terms. For each of the three equations
represents a pair of lines, and the six lines constituted by the three
line-pairs are all tangents of the same conic; indeed the equations are those of
the pairs of tangents from the vertices of the triangle of reference to the
conic 7.3. Thus if the three line-pairs have a common point $O$ three lines
of the conic-envelope 7.3 pass through $O$; this envelope must then consist
of two points $O$, $O'$ and the remaining three lines, one of each pair, pass
through $O'$. Thus the peculiarity of the equations 9.1 is that if they have
one common solution they have two; this is the reason why their resultant
is a perfect square. Incidentally the condition for 7.3 to be a point-pair
is $|h| = 0$.

10. The determinant $|H|$ was also encountered by Pasch (Pasch,
1891, p. 46), who constructed it as the Hessian of $|h|$ regarded as a
cubic form in six variables. The geometrical interpretation of this in the
space $\Sigma$ is that, $|h| = 0$ being the equation of $C$, $|H| = 0$ is the equation
of that sextic primal which is the Hessian of $C$. We now give a geometrical
proof that the Hessian of $C$ is simply $C$ itself, taken twice over.

The Hessian of a primal, in space of any number of dimensions, is the 
locus of those points whose polar quadrics with respect to the primal are 
cones; if a point is such that its polar quadric is a cone with a line-vertex 
the co-ordinates of the point cause all the first minors of the Hessian 
matrix to vanish, so that the point is a double point of the Hessian. Our 
object will thus have been achieved if it can be shown that the polar quadric 
of any point on C is a cone with a line-vertex; for it will then follow that 
every point of C is a node of its Hessian and therefore that the Hessian, 
since it is of the sixth order, consists of C taken twice over. 

That the polar quadric of a point of C is a line-cone was proved by 
Segre (1885, p. 497); an alternative proof is as follows. 

Let $A$ be a point of $C$. The polar quadric $Q_A$ is the first polar of $A$
with respect to $C$; it has the same tangent prime as $C$ at $A$ and contains
all those lines which pass through $A$ and lie on $C$. Thus it contains those
two tangent planes of $F$ which pass through $A$. Let $T_1$, $T_2$ be the points
of contact of these planes with $F$.

Any secant plane which passes through $T_1$ is met by $Q_A$ both in the
conic of $F$ lying in the plane and in the line of intersection of the plane
with the tangent plane of $F$ at $T_1$; hence the plane lies entirely on $Q_A$.
This applies to every secant plane which passes through $T_1$, so that all
the chords which join $T_1$ to the other points of $F$ lie on $Q_A$. Wherefore,
since these chords do not all lie in a prime (for if they did $F$ would do so too),
$Q_A$ must have $T_1$ as a node. Similarly, it has $T_2$ as a node, and so must be
a cone with the line $T_1T_2$ for vertex. Hence the polar quadric with respect
to $C$ of any point $A$ of $C$ itself is a line-cone, the vertex of the line-cone
lying on $C$ and joining the two points where $F$ is touched by those two of
its tangent planes which pass through $A$.

The preceding argument shows, first, that $|H|$ must be a numerical
multiple of the square of $|h|$ and, secondly, that, as was noticed by
Pasch, the rank of $H$ falls to 4 whenever $|h| = 0$.
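
Both conclusions can be illustrated numerically. The sketch below assumes the form of H read off from the leading block of 8.2 and the multiple $-\tfrac{1}{4}$ quoted in § 9; it takes a point of C in the form 6.3 and counts pivots above a small tolerance to estimate the rank. The function names and the tolerance are choices made for this illustration.

```python
import math

TAU = math.sqrt(2.0)
T = 1.0 / TAU


def h_small(eta):
    e1, e2, e3, e4, e5, e6 = eta
    return [[TAU * e1, e6, e5], [e6, TAU * e2, e4], [e5, e4, TAU * e3]]


def h_big(eta):
    e1, e2, e3, e4, e5, e6 = eta
    return [
        [0.0, e3, e2, -e4, 0.0, 0.0],
        [e3, 0.0, e1, 0.0, -e5, 0.0],
        [e2, e1, 0.0, 0.0, 0.0, -e6],
        [-e4, 0.0, 0.0, -e1, T * e6, T * e5],
        [0.0, -e5, 0.0, T * e6, -e2, T * e4],
        [0.0, 0.0, -e6, T * e5, T * e4, -e3],
    ]


def eliminate(m, tol):
    """Row-reduce a copy of m; return (number of pivots, signed product of pivots)."""
    a = [row[:] for row in m]
    n = len(a)
    rank, pivots = 0, 1.0
    for c in range(n):
        if rank == n:
            break
        p = max(range(rank, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) <= tol:
            continue
        if p != rank:
            a[rank], a[p] = a[p], a[rank]
            pivots = -pivots
        pivots *= a[rank][c]
        for r in range(rank + 1, n):
            f = a[r][c] / a[rank][c]
            for j in range(c, n):
                a[r][j] -= f * a[rank][j]
        rank += 1
    return rank, pivots


def det(m):
    rank, pivots = eliminate(m, 0.0)
    return pivots if rank == len(m) else 0.0


# A general point: |H| against -|h|^2 / 4.
eta = [1.0, -2.0, 3.0, 0.5, -1.5, 2.0]
det_H, det_h = det(h_big(eta)), det(h_small(eta))

# A point of C, taken in the form 6.3 from x = (1, 2, -1) and xi = (3, -1, 2).
x, xi = [1.0, 2.0, -1.0], [3.0, -1.0, 2.0]
eta_c = [x[0] * xi[0], x[1] * xi[1], x[2] * xi[2],
         T * (x[1] * xi[2] + x[2] * xi[1]),
         T * (x[2] * xi[0] + x[0] * xi[2]),
         T * (x[0] * xi[1] + x[1] * xi[0])]
det_h_on_C = det(h_small(eta_c))
rank_H_on_C = eliminate(h_big(eta_c), 1e-9)[0]
```

At this point of C the two null vectors of H are $x^{[2]}$ and $\xi^{[2]}$, the points $T_1$ and $T_2$ of the argument above.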

Incidentally, when $|h| = 0$ the envelope 7.3 consists of a pair of
points; hence the locus $K_3^6$ generated by the secant planes of $F$ which lie
on $Q_A$ must consist of two distinct cubic primals. It is clear, from the
descriptive argument above, that it does indeed consist of those cubic
cones which project $F$ from $T_1$ and $T_2$ respectively.

II. 

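11. Let $\phi(x_1, x_2, x_3)$ be a homogeneous quartic polynomial in $x_1, x_2, x_3$.
Take the six expressions

$$\frac{1}{12}\frac{\partial^2\phi}{\partial x_1^2},\quad
\frac{1}{12}\frac{\partial^2\phi}{\partial x_2^2},\quad
\frac{1}{12}\frac{\partial^2\phi}{\partial x_3^2},\quad
\frac{\tau}{12}\frac{\partial^2\phi}{\partial x_2\,\partial x_3},\quad
\frac{\tau}{12}\frac{\partial^2\phi}{\partial x_3\,\partial x_1},\quad
\frac{\tau}{12}\frac{\partial^2\phi}{\partial x_1\,\partial x_2};$$

these are all homogeneous quadratic polynomials, and so are linear
combinations of the six constituents

$$x_1^2,\ x_2^2,\ x_3^2,\ \tau x_2x_3,\ \tau x_3x_1,\ \tau x_1x_2$$

of $x^{[2]}$. Arrange the six quadratic polynomials, in this order, vertically
beneath one another; the coefficients of the constituents of $x^{[2]}$ then form
a symmetric matrix, which is in fact

$$\mu = \begin{pmatrix}
a & h & g & \tau\alpha & \tau q' & \tau r \\
h & b & f & \tau p & \tau\beta & \tau r' \\
g & f & c & \tau p' & \tau q & \tau\gamma \\
\tau\alpha & \tau p & \tau p' & 2f & 2\gamma & 2\beta \\
\tau q' & \tau\beta & \tau q & 2\gamma & 2g & 2\alpha \\
\tau r & \tau r' & \tau\gamma & 2\beta & 2\alpha & 2h
\end{pmatrix}, \tag{11.1}$$

where again $\tau^2 = 2$, and where the other fifteen letters denote the coefficients
of the terms

$$\frac{4!}{i!\,j!\,k!}\,x_1^i x_2^j x_3^k, \qquad \text{where } i + j + k = 4,$$

in $\phi$. Then

$$\phi = x^{[2]\prime}\mu x^{[2]}.$$

The equation $\phi = 0$ represents a quartic curve $\delta$, and, in the (1, 1)
correspondence between the points of $\sigma$ and those of $F$, $\delta$ corresponds to
that curve $\Delta$ which is the intersection of $F$ with the quadric $y'\mu y = 0$. But,
while $\Delta$ is uniquely determined when $\delta$ is given, there are other quadrics
passing through it, every quadric $y'(\mu + H)y = 0$ containing $\Delta$ for every $H$
of the form 8.1; the two equations

$$x^{[2]\prime}\mu x^{[2]} = 0, \qquad x^{[2]\prime}(\mu + H)x^{[2]} = 0$$

represent the same curve $\delta$ because $x^{[2]\prime}Hx^{[2]}$ is identically zero.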

The quadric $y'\mu y = 0$, whose matrix is the same as that formed by the
coefficients in the second polars of $\phi$, has, however, a special property; it
can thereby be identified, among the quadrics passing through $\Delta$, without
the equation $\phi = 0$ being previously given, and without any reference to $\sigma$.
For the quadrics which pass through $\Delta$ form a linear system of freedom 6,
and if a quadric of this system is subjected to the six linear conditions of
being outpolar to all those quadrics which are inscribed in $F$ (i.e. which
touch all the tangent primes of $F$), a unique quadric is thereby obtained,
which is precisely the quadric $y'\mu y = 0$. This result is due to Segre (Segre,
1892, p. 240); it is easily obtained by appealing to the special forms of the
matrices $H$ and $\mu$. For, the parametric form for the tangent primes of $F$
being given by 6.2, the matrix of the quadratic form in the prime
co-ordinates $v$ which corresponds to any quadric inscribed in $F$ must be of
the form $H$. The condition for such a quadric to be inpolar to $y'\mu y = 0$ is
that the trace of the product matrix $\mu H$, or, what is the same thing, that of
$H\mu$, should vanish. This trace is found by taking any row of $\mu$ and the
corresponding column of $H$, forming their inner product, and adding the
six products so arising; a reference to 8.1 and 11.1 shows that this sum
does in fact vanish identically.
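
The vanishing of the trace is easy to confirm numerically. The sketch below rests on two assumptions that should be read as such: the entries of $\mu$ are taken to be $w_aw_b$ times the coefficient of the product of the $a$-th and $b$-th constituents of $x^{[2]}$ (with weights 1 for squares and $\tau$ for products), the coefficient being that of the term carrying the multinomial factor $4!/(i!\,j!\,k!)$; and H is read off from the leading block of the determinant in 8.2. The names `mu_matrix`, `quartic` and `h_big`, and the sample coefficients, are purely illustrative.

```python
import math

TAU = math.sqrt(2.0)
T = 1.0 / TAU
# Exponents (i, j, k) of x1, x2, x3 in the six constituents of x^[2], and their weights.
EXPONENTS = [(2, 0, 0), (0, 2, 0), (0, 0, 2), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
WEIGHTS = [1.0, 1.0, 1.0, TAU, TAU, TAU]


def veronese(x):
    return [w * x[0] ** i * x[1] ** j * x[2] ** k
            for w, (i, j, k) in zip(WEIGHTS, EXPONENTS)]


def mu_matrix(c):
    """Matrix mu; c maps (i, j, k) to the coefficient of 4!/(i!j!k!) x1^i x2^j x3^k."""
    def entry(a, b):
        key = tuple(p + q for p, q in zip(EXPONENTS[a], EXPONENTS[b]))
        return WEIGHTS[a] * WEIGHTS[b] * c.get(key, 0.0)
    return [[entry(a, b) for b in range(6)] for a in range(6)]


def quartic(c, x):
    """Evaluate the quartic form with multinomial coefficients."""
    total = 0.0
    for (i, j, k), coeff in c.items():
        multinomial = math.factorial(4) // (
            math.factorial(i) * math.factorial(j) * math.factorial(k))
        total += multinomial * coeff * x[0] ** i * x[1] ** j * x[2] ** k
    return total


def h_big(eta):
    e1, e2, e3, e4, e5, e6 = eta
    return [
        [0.0, e3, e2, -e4, 0.0, 0.0],
        [e3, 0.0, e1, 0.0, -e5, 0.0],
        [e2, e1, 0.0, 0.0, 0.0, -e6],
        [-e4, 0.0, 0.0, -e1, T * e6, T * e5],
        [0.0, -e5, 0.0, T * e6, -e2, T * e4],
        [0.0, 0.0, -e6, T * e5, T * e4, -e3],
    ]


# Arbitrary illustrative coefficients, one for each of the fifteen monomials.
coeffs = {(i, j, 4 - i - j): float((7 * i + 3 * j * j) % 11 - 5)
          for i in range(5) for j in range(5 - i)}
mu = mu_matrix(coeffs)
x = [2.0, -1.0, 3.0]
X = veronese(x)
form_gap = sum(mu[a][b] * X[a] * X[b] for a in range(6) for b in range(6)) - quartic(coeffs, x)

H = h_big([1.0, -2.0, 3.0, 0.5, -1.5, 2.0])
trace_mu_H = sum(mu[a][b] * H[b][a] for a in range(6) for b in range(6))
```

The first check confirms that the matrix so built reproduces the quartic as a quadratic form in $x^{[2]}$; the second is the outpolarity condition itself.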

12. Among the $\infty^5$ quadrics which contain $F$ we have noticed, in § 10,
$\infty^4$ line-cones, every chord of $F$ being the vertex of such a cone. We
also noticed, in § 7, $\infty^2$ plane-cones, namely, those cones which project
the surface $F$ from its tangent planes; the tangent primes of such a cone
are those tangent primes of $F$ which pass through that one of its tangent
planes which is the vertex of the cone.

So, dually, among the $\infty^5$ quadric envelopes that are inscribed in $F$
there are $\infty^2$ whose discriminants have rank 3 instead of their full rank 6;
the primes which belong to such a quadric envelope consist of those which
touch one of the conics on $F$. Any quadric which is outpolar to $F$ must be
outpolar to all these degenerate quadric envelopes; thus, in accordance
with the usual interpretation of the vanishing of invariants of two quadrics
when the discriminant of one of them falls below its full rank, any secant
plane $\pi$ of $F$ meets any outpolar quadric in a conic which is outpolar to the
conic of $F$ which lies in $\pi$.

This being so, let $T_1$, $T_2$ be two points of $F$ that are conjugate with
respect to an outpolar quadric $Y$; the secant plane $\pi$ which contains the
conic $\gamma$ of $F$ passing through $T_1$ and $T_2$ meets $Y$ in a conic outpolar to
$\gamma$. There must then be a triangle inscribed in this conic which is self-polar
for $\gamma$ and has $T_1T_2$ for one of its sides, so that the tangents of $\gamma$ at $T_1$
and $T_2$ intersect on this conic. Whence

the tangent planes of $F$ at any two points which are conjugate with
respect to an outpolar quadric meet on this quadric.

The intersection of the two tangent planes of $F$ also lies on $C$,
and the argument is easily reversed to show that the two tangent
planes of $F$ which pass through any point common to $C$ and an
outpolar quadric have their points of contact conjugate with respect
to this quadric.

13. Consider now those secant planes of $F$ which meet an outpolar
quadric $Y$ in pairs of lines; it follows from § 5 that they are obtained
by bordering $\mu$ by three rows and columns. They can also be identified
by geometrical reasoning, and the combination of the two
results furnishes an interesting form for a well-known contravariant
of a ternary quartic.

Let a secant plane $\pi$ of $F$ meet $Y$ in a pair of lines; $\pi$, like all other
secant planes of $F$, meets $\Delta$, the curve common to $F$ and $Y$, in four points
$P_1$, $P_2$, $P_3$, $P_4$ which correspond to four collinear points of $\delta$. The points
of the conic in which $F$ is met by $\pi$ are in (1, 1) correspondence with the
points of the line on which the four points of $\delta$ lie. Now, since $\Delta$ lies on $Y$,
the two lines in which $\pi$ meets $Y$ must join the four points of $\Delta$ in pairs;
let the joins be $P_1P_2$ and $P_3P_4$. Since $P_1P_2$ lies on $Y$, $P_1$ and $P_2$ are two
points of $F$ that are conjugate with respect to $Y$; the tangent planes of $F$
at $P_1$ and $P_2$ therefore, by § 12, meet on $Y$. But this point of meeting lies
in $\pi$, being the intersection of the tangents to the conic of $F$ at $P_1$ and $P_2$,
and the only points of $Y$ which lie in $\pi$ and not on $P_1P_2$ lie on $P_3P_4$; hence
the tangents to the conic at $P_1$ and $P_2$ must meet on $P_3P_4$. Similarly the
tangents at $P_3$ and $P_4$ meet on $P_1P_2$. Wherefore $P_1P_2$ and $P_3P_4$ are
conjugate chords of the conic, and the four points in which the conic meets
$\Delta$ form, on the conic, a harmonic range.

Thus, those secant planes which meet $Y$ in pairs of lines correspond to
those lines of $\sigma$ which are cut in harmonic ranges by the quartic curve $\phi = 0$.
It was seen in § 5 that the envelope of the lines in $\sigma$ which correspond to
secant planes of $F$ that are touched by any quadric is of the sixth class;
when the quadric is outpolar to $F$ this envelope is the harmonic envelope
of a quartic curve.

It follows that when $\mu$ is bordered by the rows $U$ and the columns $U'$
the determinant of the resulting matrix gives, when equated to zero,
the harmonic contravariant of $\phi$.

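Let us give two simple instances of this, by way of illustration.

If

$$\phi(x_1, x_2, x_3) \equiv x_1^4 + x_2^4 + x_3^4,$$

$\delta$ is the curve studied by Dyck and admitting a group of 96 linear
self-transformations. The equation of its harmonic envelope is

$$\begin{vmatrix}
1 & \cdot & \cdot & \cdot & \cdot & \cdot & u_1 & \cdot & \cdot \\
\cdot & 1 & \cdot & \cdot & \cdot & \cdot & \cdot & u_2 & \cdot \\
\cdot & \cdot & 1 & \cdot & \cdot & \cdot & \cdot & \cdot & u_3 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot \\
u_1 & \cdot & \cdot & \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 & \cdot & \cdot & \cdot \\
\cdot & u_2 & \cdot & \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 & \cdot & \cdot & \cdot \\
\cdot & \cdot & u_3 & \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot & \cdot & \cdot & \cdot
\end{vmatrix} = 0,$$

which gives clearly

$$-\begin{vmatrix} 1 & \cdot & \cdot \\ \cdot & 1 & \cdot \\ \cdot & \cdot & 1 \end{vmatrix}
\begin{vmatrix} \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 \\ \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 \\ \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot \end{vmatrix}
\begin{vmatrix} \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 \\ \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 \\ \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot \end{vmatrix} = 0,$$

or

$$u_1^2u_2^2u_3^2 = 0.$$

Thus the harmonic envelope of Dyck’s curve consists of the three
vertices of the triangle of reference, each counted twice. These points can
of course be identified geometrically, independently of any co-ordinate
system to which the curve may be referred; the curve has twelve points
at which its tangents have four-point contact, and these twelve tangents
are concurrent, in three groups of four, at the points in question.

Secondly, take

$$\phi(x_1, x_2, x_3) \equiv 4(x_2^3x_3 + x_3^3x_1 + x_1^3x_2),$$

so that $\delta$ is the curve studied by Klein and admitting a group of 168 linear
self-transformations. The equation of its harmonic envelope is

$$\begin{vmatrix}
\cdot & \cdot & \cdot & \cdot & \cdot & \tau & u_1 & \cdot & \cdot \\
\cdot & \cdot & \cdot & \tau & \cdot & \cdot & \cdot & u_2 & \cdot \\
\cdot & \cdot & \cdot & \cdot & \tau & \cdot & \cdot & \cdot & u_3 \\
\cdot & \tau & \cdot & \cdot & \cdot & \cdot & \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 \\
\cdot & \cdot & \tau & \cdot & \cdot & \cdot & \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 \\
\tau & \cdot & \cdot & \cdot & \cdot & \cdot & \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot \\
u_1 & \cdot & \cdot & \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 & \cdot & \cdot & \cdot \\
\cdot & u_2 & \cdot & \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 & \cdot & \cdot & \cdot \\
\cdot & \cdot & u_3 & \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot & \cdot & \cdot & \cdot
\end{vmatrix} = 0.$$

On expansion the known form

$$u_2u_3^5 + u_3u_1^5 + u_1u_2^5 - 5u_1^2u_2^2u_3^2 = 0$$

for the harmonic envelope is found.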







14. The determinant $|\mu|$ is well known as an invariant of a ternary
quartic; when $\mu$ is bordered by the single row $u^{[2]}$ and the transposed
column the ensuing determinant is a contravariant which was found by
Clebsch in 1861. That this bordering of $\mu$ gives Clebsch’s contravariant
was observed, for the dual case of a ternary quartic in the line co-ordinates,
by Scherrer in 1882; that the bordering by $U$ gives the harmonic
contravariant does not seem to have been perceived before.

It seems fitting to give the simple direct proof that these functions so 
formed are actually invariantly related to the quartic curve; the proof is 
immediate once the theorem of § 15 has been established and the equation 
16.2 written down. The method of proof clearly admits of immediate 
extension to forms of any even order in any number of variables. 

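15. Suppose that the point co-ordinates are subjected to the linear
transformation

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} =
\begin{pmatrix}
m_{11} & m_{12} & m_{13} \\
m_{21} & m_{22} & m_{23} \\
m_{31} & m_{32} & m_{33}
\end{pmatrix}
\begin{pmatrix} \xi_1 \\ \xi_2 \\ \xi_3 \end{pmatrix},$$

or

$$x = m\xi. \tag{15.1}$$

This induces the transformation

$$x^{[2]} = m^{[2]}\xi^{[2]}, \tag{15.2}$$

the second induced matrix of $m$ being defined by 15.2.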

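The transformation of differential operators consequent upon 15.1 is

$$D_\xi \equiv \left(\frac{\partial}{\partial \xi_1}, \frac{\partial}{\partial \xi_2}, \frac{\partial}{\partial \xi_3}\right)
= \left(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}\right) m \equiv D_x m, \tag{15.3}$$

whence it follows that

$$D_\xi^{[2]} \equiv \left(\frac{\partial^2}{\partial \xi_1^2}, \frac{\partial^2}{\partial \xi_2^2}, \frac{\partial^2}{\partial \xi_3^2},
\tau\frac{\partial^2}{\partial \xi_2 \partial \xi_3}, \tau\frac{\partial^2}{\partial \xi_3 \partial \xi_1}, \tau\frac{\partial^2}{\partial \xi_1 \partial \xi_2}\right)
= D_x^{[2]} m^{[2]}. \tag{15.4}$$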


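Now we have the equations

$$\tfrac{1}{12} D_x^{[2]}\phi = x'^{[2]}\mu, \qquad \tfrac{1}{12} D_\xi^{[2]}\phi = \xi'^{[2]}\nu,$$

where $\nu$ is the matrix formed from the coefficients of the form into which
$\phi(x_1, x_2, x_3)$ is turned by the transformation 15.1, in exactly the same way
as $\mu$ is formed from the coefficients of $\phi$ before it is subjected to the
transformation. But the first of these equations is, by 15.2,

$$\tfrac{1}{12} D_x^{[2]}\phi = \xi'^{[2]} m'^{[2]} \mu; \tag{15.5}$$

while the second is, by 15.4,

$$\tfrac{1}{12} D_x^{[2]}\phi \, m^{[2]} = \xi'^{[2]}\nu. \tag{15.6}$$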







Comparison of 15.5 and 15.6 shows that

$$\nu = m'^{[2]}\mu\, m^{[2]}, \tag{15.7}$$


and so the following theorem has been established: 

If $\mu$ is a matrix of the form shown in 11.1, and $m$ any matrix of the
third order, then $m'^{[2]}\mu\, m^{[2]}$ is of the same form as $\mu$, where $m'^{[2]}$ is the
transposed matrix of $m^{[2]}$, the second induced matrix of $m$.

The work of § 16 would have no validity had this theorem not been 
established. 

The proof of the theorem may be carried over almost verbatim to the
case of the $p$th induced matrix of $m$; if $\mu$ for the moment denotes the
matrix of the coefficients in the $\tfrac{1}{2}(p+1)(p+2)$ $p$th polars of a ternary form
of even degree $2p$, then $m'^{[p]}\mu\, m^{[p]}$ is of the same form as $\mu$. And the
corresponding result holds for forms of even order in any number of
variables, the order of $m$ being equal to this number of variables. For
quadratic forms the theorem becomes the trivial one that, if $\mu$ is symmetric,
so is $m'\mu m$, where $m$ is any matrix of the same order as $\mu$.

16. Suppose now that m is non-singular, so that, when the point 
co-ordinates are subjected to the transformation 15.1, the line co-ordinates 
are subjected to the contragredient transformation 

$$(u_1, u_2, u_3) = (\omega_1, \omega_2, \omega_3)\, m^{-1}, \qquad u = \omega m^{-1}. \tag{16.1}$$


It is a known property of the $p$th induced matrix of a matrix $m$ of order
$n$ that its determinant is (the result being attributed by Muir to Schläfli,
1851) the determinant $|m|$ raised to the power

$$\frac{(p+n-1)!}{(p-1)!\, n!}.$$

Hence, equating the determinants of the two sides of 15.7,

$$|\nu| = |m'^{[2]}|\,|\mu|\,|m^{[2]}| = |m|^4\,|\mu|\,|m|^4 = |\mu|\,|m|^8,$$

showing that $|\mu|$ is an invariant, of weight 8, of the ternary quartic $\phi$.
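
The exponent used above can be checked on a small example. The sketch below is an illustration under stated assumptions rather than part of the paper: it builds the second induced matrix of a 3 x 3 matrix from the rule $x = m\xi \Rightarrow x^{[2]} = m^{[2]}\xi^{[2]}$, taking the products in the order $x_1^2, x_2^2, x_3^2, x_2x_3, x_3x_1, x_1x_2$ and scaling the last three by a factor `tau`. The helper names are hypothetical, and the sample matrix is arbitrary. Rescaling by `tau` only conjugates the matrix by a diagonal matrix, so the determinant does not depend on it; exact rational arithmetic is used so that the comparison with $|m|^4$ is exact.

```python
from fractions import Fraction
from math import factorial

# Index pairs in the order x1^2, x2^2, x3^2, x2x3, x3x1, x1x2 (0-based).
PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (2, 0), (0, 1)]


def second_induced(m, tau=Fraction(1)):
    """6x6 matrix M with x^[2] = M xi^[2] whenever x = m xi."""
    scale = [Fraction(1)] * 3 + [Fraction(tau)] * 3
    M = []
    for a, (i, j) in enumerate(PAIRS):
        row = []
        for b, (k, l) in enumerate(PAIRS):
            # Coefficient of xi_k xi_l in the product x_i x_j.
            c = m[i][k] * m[j][l]
            if k != l:
                c += m[i][l] * m[j][k]
            row.append(scale[a] * c / scale[b])
        M.append(row)
    return M


def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in r] for r in rows]
    n, d = len(a), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def schlafli_exponent(p, n):
    """(p + n - 1)! / ((p - 1)! n!), the power of |m| quoted in the text."""
    return factorial(p + n - 1) // (factorial(p - 1) * factorial(n))


m = [[2, 1, 0], [1, 3, 1], [0, 1, 1]]  # arbitrary sample matrix
print(det(m), det(second_induced(m)), schlafli_exponent(2, 3))
```

For $p = 2$, $n = 3$ the exponent is 4, so the two induced factors in 15.7 together contribute $|m|^8$, in agreement with the weight 8 found above.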
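Consider next Clebsch’s contravariant

$$\begin{vmatrix} \mu & u'^{[2]} \\ u^{[2]} & \cdot \end{vmatrix}.$$

Since, by 16.1, $\omega = um$, $\omega^{[2]} = u^{[2]}m^{[2]}$. Hence

$$\begin{pmatrix} \nu & \omega'^{[2]} \\ \omega^{[2]} & \cdot \end{pmatrix}
= \begin{pmatrix} m'^{[2]} & \cdot \\ \cdot & 1 \end{pmatrix}
\begin{pmatrix} \mu & u'^{[2]} \\ u^{[2]} & \cdot \end{pmatrix}
\begin{pmatrix} m^{[2]} & \cdot \\ \cdot & 1 \end{pmatrix},$$

the matrices being partitioned conformably, so that 1 denotes a single
unit. Taking determinants gives

$$\begin{vmatrix} \nu & \omega'^{[2]} \\ \omega^{[2]} & \cdot \end{vmatrix}
= |m|^8 \begin{vmatrix} \mu & u'^{[2]} \\ u^{[2]} & \cdot \end{vmatrix},$$

thus directly establishing the contravariance and displaying the weight 8.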
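Now let us take the form found for the harmonic envelope of $\phi$, namely

$$\begin{vmatrix} \mu & U' \\ U & \cdot \end{vmatrix},$$

and establish its contravariance directly. This is easily achieved if it is
observed that

$$(u_1, u_2, u_3)
\begin{pmatrix}
u_1 & \cdot & \cdot & \cdot & \tau^{-1}u_3 & \tau^{-1}u_2 \\
\cdot & u_2 & \cdot & \tau^{-1}u_3 & \cdot & \tau^{-1}u_1 \\
\cdot & \cdot & u_3 & \tau^{-1}u_2 & \tau^{-1}u_1 & \cdot
\end{pmatrix}
= (u_1^2, u_2^2, u_3^2, \tau u_2u_3, \tau u_3u_1, \tau u_1u_2),$$

or

$$uU = u^{[2]}, \tag{16.2}$$

and so also, if $\Omega$ denotes the matrix formed from $\omega$ in the same way that
$U$ is formed from $u$,

$$\omega\Omega = \omega^{[2]}. \tag{16.3}$$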


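But this last equation is

$$um\Omega = u^{[2]}m^{[2]},$$

and so, substituting for $u^{[2]}$ from 16.2,

$$m\Omega = Um^{[2]}, \quad\text{giving}\quad \Omega = m^{-1}Um^{[2]}.$$

It is now seen immediately that

$$\begin{pmatrix} \nu & \Omega' \\ \Omega & \cdot \end{pmatrix}
= \begin{pmatrix} m'^{[2]}\mu\, m^{[2]} & m'^{[2]}U'm'^{-1} \\ m^{-1}Um^{[2]} & \cdot \end{pmatrix}
= \begin{pmatrix} m'^{[2]} & \cdot \\ \cdot & m^{-1} \end{pmatrix}
\begin{pmatrix} \mu & U' \\ U & \cdot \end{pmatrix}
\begin{pmatrix} m^{[2]} & \cdot \\ \cdot & m'^{-1} \end{pmatrix},$$

and so, taking determinants,

$$\begin{vmatrix} \nu & \Omega' \\ \Omega & \cdot \end{vmatrix}
= |m|^4\,|m|^{-1} \begin{vmatrix} \mu & U' \\ U & \cdot \end{vmatrix} |m|^4\,|m|^{-1}
= |m|^6 \begin{vmatrix} \mu & U' \\ U & \cdot \end{vmatrix}.$$

This establishes the contravariance, and is in accordance with the
known fact that the harmonic envelope is of weight 6, its Aronhold symbol
being $(bcu)^2(cau)^2(abu)^2$.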





REFERENCES TO LITERATURE. 

Bertini, E., 1923. Introduzione alla geometria proiettiva degli iperspazi,
Seconda Edizione, Messina.

Cayley, A., 1856. “Note upon a result of elimination,” Phil. Mag., vol. xi, pp.
378-379; Collected Mathematical Papers, vol. iii, pp. 214-215.

Clebsch, A., 1861. “Über Curven vierter Ordnung,” Journ. reine angew. Math.,
vol. lix, pp. 125-145 (139).

Edge, W. L., 1938. “Notes on a net of quadric surfaces (III),” Proc. London
Math. Soc. (2), vol. xliv, pp. 466-480.

Muir, T., 1906 and 1911. The theory of determinants in the historical order of
development, vol. i (London, 1906), p. 244; vol. ii (London, 1911), p. 52.

Pasch, M., 1891. “Über bilineare Formen und deren geometrische Anwendung,”
Math. Ann., vol. xxxviii, pp. 24-49.

Scherrer, F. R., 1882. “Über ternäre biquadratische Formen,” Ann. Mat.
pura appl. (2), vol. x, pp. 212-223.

Segre, C., 1885. “Considerazioni intorno alla geometria delle coniche di un
piano e alla sua rappresentazione sulla geometria dei complessi lineari di
rette,” Atti Acc. Torino, vol. xx, pp. 487-504.

Segre, C., 1892. “Alcune idee di Ettore Caporali intorno alle quartiche piane,”
Ann. Mat. pura appl. (2), vol. xx, pp. 237-242.

Sylvester, J. J., 1841. “Examples of the dialytic method of elimination as 
applied to ternary systems of equations,” Cambridge Math. Journ., vol. ii, 
pp. 232-236; Collected Mathematical Papers, vol. i, pp. 61-65. 

Veronese, G., 1884. “La superficie omaloide normale a due dimensioni e del 
quarto ordine dello spazio a cinque dimensioni e le sue proiezioni nel piano e 
nello spazio ordinario,” Mem. Accad. Lincei (3), vol. xix, pp. 344-371. 


(Issued separately December 16, 1941.)





XIII. Some Disputed Questions in the Philosophy of the
Physical Sciences. By E. T. Whittaker, F.R.S., P.R.S.E.

(Address of the President at the Annual Statutory Meeting,
October 27, 1941.)

(MS. received October 27, 1941.) 

More than two thousand years ago the Greek philosophers raised certain 
questions, which are still undecided, about the origin and character of 
knowledge regarding the external world. After a period of comparative 
quiet, the discussion has become very active recently, under the stimulus 
of the new discoveries in mathematical physics; and, in particular, a 
lively debate is in progress at the present moment between Sir Arthur 
Eddington and Dr Harold Jeffreys of Cambridge, Professor Milne of 
Oxford, Sir James Jeans, and Professor Dingle of the Imperial College, 
the subject being the respective shares of reason and observation in the 
discovery of the laws of nature. I propose this afternoon to offer some 
remarks on the history and present state of this controversy. 

It is admitted by everyone that the mathematical pre-calculation of 
natural events is conceivable only because the world is rationally made: 
and that the ideal of science is to discover laws and equations sufficient to 
make every physical event predictable, so that mankind will some day 
possess a corpus of purely mathematical relations, capable of representing 
and foretelling every happening in the inanimate external world. Now 
it is characteristic of any branch of mathematics, that the whole of it can 
be worked out from a few definitions and assumptions set down at the
beginning. The question therefore arises, what are the fundamental data
or postulates from which the complete set of laws of the material universe
can be deduced by pure mathematics? And are these data or postulates
furnished to us by the senses (by observation and experiment), or are
they self-evident truths, revealed and assured to us by intuition?

This fundamental problem of the philosophy of nature was first
conceived and discussed by the ancient Greeks, whose judgment on it
can be very clearly seen in their treatment of geometry. Geometry is the
science of spatial relations in the external world and so is essentially a part
of natural philosophy: and the knowledge of many geometrical properties
was, doubtless, originally derived from observation: thus according to
one conjecture, Thales became convinced that the angle in a semicircle 
is always a right angle, by gazing at tiles which showed, as part of the 
ornamental design, a rectangle inscribed in a circle. This earliest stage 
of geometry corresponded to that now generally imposed on school children, 
who are taught to draw figures with ruler and compasses, and to measure 
angles of triangles with protractors, before being introduced to the 
deductive treatment of the subject. When the logical connexion of 
different theorems was established by Pythagoras and his school, the 
idea gained ground that it might be possible to link up the entire science 
of geometry into a chain of propositions obtained by syllogistic reasoning 
from a small number of original premisses. The great question was, 
from whence were these premisses derived? Plato believed that they 
could all be obtained by pure intellection; and Euclid drew up a list of 
five “common notions” or axioms and five “postulates,” from which he 
professed to demonstrate all the results of geometry as logical conclusions.
The “common notions,” such as “Things that are equal to the same
thing are equal to each other,” were presented as self-evident truths
which belong alike to every science. The postulates, such as “Let it be
granted that a circle may be described with any centre and diameter,”
were put forward as admitted feasibilities, peculiar to the science of
geometry. Neither for common notions nor for postulates was any 
proof offered: the disciple was expected to know by intuition that they 
were necessary, that things could not be otherwise. Thus the Greek 
philosophers taught that although geometry was a science relating to 
the sensible external universe, it could be built up completely without 
having recourse at any stage to quantitative observation. 

Even in the ancient world there was some uneasiness about the validity 
of this doctrine. One of Euclid’s premisses was the famous “parallel- 
postulate,” “If a straight line, falling on two straight lines, makes the 
interior angles on the same side less than two right angles, the two 
straight lines, if produced indefinitely, meet on that side on which are 
the angles less than two right angles.” This is not a particularly simple 
statement—are we justified in accepting it without proof as true? Some 
of the Greeks, and many later mathematicians down to the nineteenth 
century, were inclined to suspect that, though true, it was not a primary 
truth which could be referred immediately to the intuition, but a theorem 
which could be established by deductive reasoning from simpler self-evident
assumptions. Accordingly they tried to discover some more
plausible axiom from which it could be derived: an endeavour which met
with a certain degree of success when in the seventeenth century Wallis
showed that if the existence of triangles different in size but similar in
shape could be assumed, then the parallel-postulate could be proved and 
Euclidean geometry re-constituted. Even this, however, was not 
altogether satisfactory: for it cannot be claimed that a belief in similar 
triangles is a necessity of human thought. This may easily be shown as 
follows: take a globe such as the earth, and mark any number of points 
on it, such as London, Edinburgh, Stockholm, and so forth; then mark 
on the globe the tracks of airmen flying directly between these points. 
The three tracks joining any three points to each other form what is 
called a “spherical triangle,” and the angles which the tracks make with 
each other at the three points are called the “angles of the spherical 
triangle”—evidently the angles determine the shape of the triangle. 
Now consider a particular spherical triangle formed on the earth in the 
following way: one vertex N is to be at the north pole, and the other 
two vertices A and B are to be on the equator. The sides NA and NB 
will then be meridians of longitude on the earth, and the side AB will
be part of the equator. It is evident that the angles at A and B are right 
angles: and hence the sum of the angles of the triangle is equal to two 
right angles together with the angle at N. Now, since A and B are on 
the equator, and N is the pole, the area of the spherical triangle is 
evidently proportional to the angle at N: and thus we have for triangles 
of this kind the result that the sum of the three angles is greater than two 
right angles, by an amount which is proportional to the area of the 
triangle. This result may be shown to be true for any spherical triangle 
whatever—which proves that two spherical triangles of different sizes 
cannot possibly have the same angles, since the angle-sum must be greater 
for the larger than for the smaller triangle.
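
The polar triangle just described can be put into numbers. The following is a minimal sketch (the function name and the unit radius are assumptions made for illustration): it takes the difference of longitude between the meridians NA and NB, forms the angle-sum, and compares the excess over two right angles with the area, which is half of the lune between the two meridians.

```python
import math


def polar_triangle(delta_lon, radius=1.0):
    """Triangle with N at the pole and A, B on the equator, the meridians
    NA and NB being delta_lon radians of longitude apart."""
    angle_sum = math.pi / 2 + math.pi / 2 + delta_lon  # right angles at A and B
    excess = angle_sum - math.pi                        # amount above two right angles
    # The two meridians bound a lune of area 2 * delta_lon * R^2;
    # the equator cuts that lune in half.
    area = delta_lon * radius ** 2
    return angle_sum, excess, area


for degrees in (30, 60, 90):
    s, e, a = polar_triangle(math.radians(degrees))
    print(degrees, round(math.degrees(s), 6), round(e / a, 6))
```

The ratio of excess to area is the same for every such triangle (it equals $1/R^2$), so two triangles of this kind with equal angles necessarily have equal areas.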

Thus there are no similar triangles in spherical geometry—that is to 
say, the mind can conceive of a type of geometry in which similar triangles 
do not exist. Thus Wallis’s postulate is not something which we are 
compelled to believe by reason of the structure of our minds. 

We are therefore forced to the conclusion that Euclidean geometry 
cannot be deduced from self-evident axioms; and hence the question as 
to whether it is the true geometry of the actual world can only be settled 
by making measurements and examining how far they agree with the 
theoretical predictions. Geometry is a branch of experimental knowledge.
This fact, so important from the point of view of the philosophy
of the physical sciences, obtained general acceptance very slowly: the
belief that the Euclidean system is necessarily true was unquestioned for
two millenniums, and became finally discredited only in the second half
of the nineteenth century.




We have now to inquire whether the ancient Greeks thought it possible 
that the other sciences might, like geometry, be deduced logically from 
premisses self-evident to the intuition. On this point Aristotle, at any 
rate, was quite clear. “The principles which lie at the basis of any
particular science,” he says,* “are derived from experience (ἐμπειρία):
thus it is from astronomical observation that we derive the principles of
astronomical science.” He poured scorn on what he called ἀπειρία,†
that is, the state of those who devote themselves entirely to abstract
reasoning from intuitive postulates, and are indifferent to facts. And he
insisted on the vital importance of investigating any subject scientifically
(φυσικῶς), that is to say, by the study of sensible objects in nature,‡
rather than dialectically (λογικῶς), that is, by pure deduction from
unproved assumptions. The primary elements of true science must be the
result of induction (ἐπαγωγή) based on phenomena, proceeding from
particulars to universals. Sense-perception is indispensable to induction,
and induction to discovery of the laws of nature. 

The opinions of Aristotle on this subject were accepted by the schoolmen
who developed his philosophy in the Middle Ages. St Thomas
Aquinas himself said § that the only secure foundation is experiment, 
and many of the scholastics made accurate observations and rational 
deductions from them. It may seem at first sight surprising that under 
these circumstances the thirteenth and fourteenth centuries did not 
anticipate the brilliant scientific discoveries of the seventeenth and 
eighteenth. The chief reasons were, I think, these. In the first place, 
the outstanding example of Euclidean geometry, at that time the most 
fully developed branch of science, whose results were believed to have been 
obtained without any dependence on observation or experiment, encouraged
among many of the schoolmen a tendency to neglect the principles
so clearly set forth by Aristotle and St Thomas, and to concentrate their 
interest on pure deduction; and this was not uncongenial to their general 
outlook, for, recognising in the external physical order an expression of 
the Divine Reason, they inferred that the human reason was the proper 
organ for its investigation. In the second place, it is to be remembered 
that Aristotle was a biologist rather than a physicist; and the biologist, 
at least in the earlier stages of biology, classifies rather than measures. 
The emphasis in Aristotelianism was decidedly on classification; and the 
idea of mathematical precision in quantitative measurements, which is
the fundamental principle of modern physics, scarcely entered into the
conception of Induction which the Scholastics inherited from the
Peripatetics. Again, the primitive biologist depends on observation (the study
of those appearances which nature presents of her own accord) rather
than on experiment (the artificial production of effects under conditions
specially selected and controlled by the experimenter). Simple observation
will carry the zoologist a long way: but it is insufficient for the
purposes, say, of the chemist, who has to disentangle all the antecedents
and concomitants of the reaction he has observed. Hence Aristotle and
his followers were much more at home in the biological than in the
physical domain.

* Prior Analytics, I, 30.
† De Gen. et Corr., I, 2.
‡ Ibid.
§ Cf. St Thomas, Commentary on Aristotle’s “Physics,” Lib. VIII, Cap. 1, Lect. 3, 4.

The great archetype of modern scientific discovery was the Newtonian 
law of gravitation. How wonderful an achievement it was may be 
seen by comparing it with the concept of nature as a system of vortices 
proposed only forty years earlier by Descartes, in which there is no 
attempt to obtain concordance between the deductions of the theory 
and experimental data. With Newton, on the other hand, we find the 
complete technique of research as it is practised to-day—namely, first 
the accurate measurement of phenomena; secondly, the imagination of a 
physical hypothesis likely to account for them, then the working out by 
mathematics of the detailed consequences of the hypothesis; and lastly, 
the comparison of the results of these calculations with the observations.
When we reflect that by this investigation the entire cosmos was for the 
first time shown to conform to prediction based on a mathematical law, 
and that it provided the model and pattern for all subsequent progress, 
we cannot but confirm the judgment engraved below the statue in Trinity 
chapel: Newton, qui genus humanum ingenio superavit.

From the time of Newton until quite recently the principle that science 
rests fundamentally on observation and experiment has been unchallenged. 
In order to understand the present revolt against it, let us first consider
some striking cases in which a mathematician, working in his study 
without any contact with laboratories, has been able to predict the 
existence of wholly new and unexpected phenomena in the external
world. 

A typical example is Hamilton’s discovery of conical refraction. If
we mark a dot on a piece of paper, and look at it through a crystal of
Iceland spar, we see in general not one dot but two: this is because the
spar has the property called double refraction. In 1821 the French
physicist Fresnel discovered the equation of the wave-surface, or locus
at any instant of a disturbance generated at a particular point at some
previous instant, in a doubly refracting crystal; but he did not study
the geometrical features of the surface as a mathematician would do.
This was precisely what Hamilton did. He found that Fresnel’s surface 
had some remarkable singularities, such as sharp peaks like the vertex 
of a cone, at each of which it had an infinite number of tangent-planes; 
and from the existence of these mathematical peculiarities in the wave- 
surface he inferred the existence of a corresponding optical phenomenon 
of a most amazing kind, namely, that a ray of light within the crystal 
would, under certain circumstances, be divided on emergence into an 
infinite number of rays, constituting a conical surface, while a single ray 
in air, incident on the crystal, might give a cone of rays inside the crystal. 
It should therefore be possible to arrange an experiment so that a dot on 
a sheet of paper, when viewed through the Iceland spar, should appear 
not as two dots, but as a complete circle. This prediction was immediately 
verified. 

Since the beginning of the twentieth century many novel and 
remarkable effects in physics have been discovered by mathematicians. 
Einstein’s predictions of the bending of light-rays by the sun’s attraction, 
and of the red-shift of spectral lines emitted in a strong gravitational 
field, are two notable instances. These were deduced from his general 
theory of relativity, which was entirely mathematical—that is to say, it 
did not originate in, or depend on, any new experiments or observations. 
Yet another striking discovery of the same type was the recognition 
that ordinary hydrogen gas is a mixture of two different kinds of molecules. 
When it is remembered that for a century and a half hydrogen has been 
familiar to every schoolboy beginning science, and has been the subject 
of innumerable experiments by highly trained investigators, the announcement
of its composite character came as a great surprise. The mathematical
reasoning involved turned on a distinction which is made in algebra: 
if we have two algebraic quantities x and y, we can form from them 
certain expressions such as x+y whose value is unaltered when x and y 
are interchanged with each other; these are said to be symmetric in x 
and y. We can also form expressions such as x − y whose value is
reversed in sign when x and y are interchanged; these are said to be 
skew in x and y. Now it was found, in 1927, that the mathematical 
equations which represent the conditions of existence of the hydrogen 
molecule possess two different solutions, of which one is symmetric and 
the other is skew. It followed that there must be two different kinds of 
hydrogen molecule, to which the names para-hydrogen and ortho-hydrogen 
were given. These two tautomers behave in exactly the same way as
regards the formation of chemical compounds, which explains why the
chemists had never distinguished or separated them; but the specific heat
of para-hydrogen is greater at low temperatures than that of ortho-hydrogen;
and their boiling-points and conductivities are also different.
Hydrogen gas prepared by the usual chemical processes consists of 
one-quarter para-hydrogen and three-quarters ortho-hydrogen. 
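
The distinction between symmetric and skew expressions, and the proportions of one-quarter and three-quarters, can be illustrated by a simple count. The sketch below rests on assumptions not spelled out in the address: it takes the standard identification of ortho-hydrogen with the nuclear spin states that are symmetric in the two nuclei and para-hydrogen with the skew state, and it weights the four spin states equally, as is appropriate at ordinary temperatures. The function names are illustrative only.

```python
from fractions import Fraction


def swap(state):
    """Interchange the two particles in a state {(a, b): coefficient}."""
    return {(b, a): c for (a, b), c in state.items()}


def classify(state):
    """Report whether a state is unchanged or reversed in sign by the interchange."""
    swapped = swap(state)
    if swapped == state:
        return "symmetric"
    if swapped == {k: -c for k, c in state.items()}:
        return "skew"
    return "neither"


# Two spin-1/2 nuclei, each "up" or "down": four independent states in all.
STATES = {
    "up,up": {("up", "up"): 1},
    "down,down": {("down", "down"): 1},
    "up,down + down,up": {("up", "down"): 1, ("down", "up"): 1},
    "up,down - down,up": {("up", "down"): 1, ("down", "up"): -1},
}

counts = {"symmetric": 0, "skew": 0}
for name, state in STATES.items():
    counts[classify(state)] += 1

ortho_fraction = Fraction(counts["symmetric"], 4)  # assumed: ortho = symmetric spin states
para_fraction = Fraction(counts["skew"], 4)        # assumed: para = skew spin state
print(counts, ortho_fraction, para_fraction)
```

Three of the four states are symmetric and one is skew, which matches the proportions quoted for hydrogen prepared by the usual chemical processes.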

We have now to consider whether the doctrine that science must be 
based on observation needs any modification, in view of the fact that 
discoveries in physics of the most unexpected kind and of the greatest 
importance are frequently made by mathematicians who have never 
performed, or even seen, an experiment in their lives. Obviously the 
dominant principle to be taken into account here is that the world is 
rational, the different phenomena being interconnected logically, so that
if we have found by observation a certain number of them, we can deduce 
the others by pure reasoning without making any fresh observations. 
Thus if we find by experiment that the coefficient of magnetisation of a 
piece of soft iron decreases as its temperature increases, we can predict 
that a magnet will become heated when it is moved from weak to strong 
regions of the magnetic field. 

We are thus led up to the most important problem of Natural 
Philosophy: What is the minimum set of observational data which is 
sufficient to form a basis for the whole edifice of physical theory? Certain 
eminent living physicists now give to this question the amazing answer: 
None at all! That is to say, they repudiate altogether the principle 
enunciated by Aristotle, St Thomas, and Newton, and return to the 
attitude which was maintained for so long with regard to geometry, and 
which they now propose to extend to the whole of physical science. 

In order to appreciate their standpoint, so unexpected and paradoxical 
as it must at first seem, let us consider one particular branch of physics, 
namely, the subject of electricity and magnetism. The early experiments 
and deductions in frictional electricity by Gilbert, Gray, Du Fay, Watson, 
Franklin, and Cavendish were completed by the measurement of the 
attraction between two electrically charged bodies, and its representation 
by the law generally known by the name of Coulomb, but actually first 
discovered by Priestley. In magnetism the recognition of magnetic poles 
by Petrus Peregrinus in the thirteenth century led eventually to the 
determination by Michell of the law of force between them. In the 
first half of the nineteenth century Oersted found that an electric current 
generates a magnetic field, Ampère showed that a ponderomotive force is
exerted between two circuits carrying electric currents, and Faraday
discovered the induction of currents. These researches, spread over six
centuries and pursued incessantly during the last two of them, were
experimental, and each of the physical properties concerned rested
independently on its own observational basis. The various results were
finally combined and consummated in the general electromagnetic theory 
of Maxwell, which comprehended all known electric and magnetic 
phenomena. 

Before Maxwell, the story is one of performing experiments and 
devising formulae to represent the results. But the post-Maxwellian 
period is wholly different in character. Maxwell’s synthesis went beyond 
the experiments and predicted the existence of something that had never 
as yet been observed, namely, electromagnetic waves—what are now 
called wireless waves—and the correctness of this inference was verified 
after the death of Maxwell by the celebrated researches of Hertz. It 
may be said generally that experimental investigations in electricity from 
1870 to 1900, such, for instance, as Rowland’s work on fields produced by 
the motion of electric charges, were directly inspired by Maxwell’s theory, 
and that the mathematical equations were capable of furnishing the 
results of the experiments beforehand. 

The change in the method of discovery after Maxwell may be illustrated
by a simple analogy. Suppose that a map of Scotland is pasted
on stiff cardboard and then cut up into small irregular pieces, so that 
it can be used as a jigsaw puzzle. Anyone who tries to solve the puzzle 
does not at first know what is represented, and his only possibility of 
procedure is to find pieces which fit into each other and so constitute
larger parts of the whole. After a time, however, he will have progressed
sufficiently to be able to guess that what is represented is Scotland, and 
from that time onwards he completes the work not by finding pieces which 
fit into each other, but by using his a priori knowledge of Scotland to 
put every fragment into its proper place. These two methods may be 
likened to the two types of research in physical science: the earlier, 
proceeding step by step by experiment in special topics; and the later, 
knowing a priori what ought to be, because a guiding principle is now 
available for the whole, permitting the extension of knowledge by purely 
rational methods. When the work of fitting together the puzzle is 
completed, the picture alone remains as significant: that is to say, in
physics, the mathematical theory. 

The logical unity achieved in electrical science by Maxwell had this 
consequence; that the vast majority of the experiments, on which it had 
originally been built up, were now superfluous, and the number of ultimate 
independent observational facts necessary for its establishment was very
small. Eventually they were reduced to only one, namely, that when a
hollow metallic vessel is electrified there is no electric field in the air
inside. From this single datum, combined with a priori principles such
as the axiom of relativity, it is possible to derive mathematically, first,
the inverse-square law of force between electric charges at rest, then (by 
considering these charges in motion relative to an observer) to deduce 
the existence of magnetic force, and finally to obtain in Maxwell’s form 
the general equations of the electromagnetic field. 
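
The link between the absence of field inside a charged hollow conductor and the inverse-square law can be illustrated numerically. The following is a rough sketch under simplifying assumptions, not the derivation referred to above: the conductor is taken to be a uniformly charged spherical shell of unit radius, the law of force is supposed to be of the form $1/r^p$, and the field is evaluated at a point inside the shell at distance `a` from the centre. The function name and the sample values are illustrative.

```python
def axial_field_inside_shell(a, p, steps=20000):
    """Field component along the axis at distance a (< 1) from the centre of a
    unit spherical shell carrying unit total charge, for a force law 1/r**p.
    The ring of the shell at polar angle theta carries charge du/2, u = cos(theta)."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        u = -1.0 + (i + 0.5) * h            # midpoint rule in u = cos(theta)
        s2 = 1.0 + a * a - 2.0 * a * u      # squared distance from ring to field point
        total += (a - u) / s2 ** ((p + 1) / 2.0)
    return 0.5 * total * h


for p in (1.0, 2.0, 3.0):
    print(p, axial_field_inside_shell(0.5, p))
```

Among these power laws the interior field vanishes only for $p = 2$; for $p = 1$ and $p = 3$ it is of opposite signs. This is the sense in which a null observation inside the shell singles out the inverse square.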

The single experimental fact on which electric and magnetic science 
can be founded may be put in the form, “It is impossible to set up an 
electric field in any region of space by enclosing the space in a hollow 
conductor of any shape or size and charging the outside of the conductor”;
and in this form it bears a certain family likeness to other statements on 
which important branches of physics have been based, such as the 
postulate of thermodynamics (from which a great part of physical 
chemistry is derived), “It is impossible to derive mechanical effect 
from any portion of matter by cooling it below the temperature of the 
coldest of the surrounding objects”; or the postulate of Relativity, “It
is impossible to detect a uniform translatory motion, which is possessed 
by a system as a whole, by observations of phenomena taking place 
wholly within the system”; or the postulate (which plays an important 
part in the explanation of homopolar bonds in chemistry) that “It is 
impossible at any instant to assert that a particular electron is identical 
with some particular electron which had been observed at an earlier 
instant”; or the postulate of Imperfect Definition in quantum mechanics, 
“It is impossible to measure precisely the momentum of a particle at the 
same time as a precise measurement of its position is made.” Each of 
these statements, which I propose to call Postulates of Impotence, asserts
the impossibility of achieving something, even though there may be an 
infinite number of ways of trying to achieve it. A postulate of impotence 
is not the direct result of an experiment, or of any finite number of experiments;
it does not mention any measurement, or any numerical relation
or analytical equation; it is the assertion of a conviction of the mind, 
that all attempts to do a certain thing, however made, are bound to fail. 
We must therefore distinguish a postulate of impotence, on the one hand, 
from an experimental fact; and we must also distinguish it, on the other 
hand, from the statements of Pure Mathematics, which do not depend 
in any way on experience, but are necessitated by the structure of the 
human mind; such a statement, for instance, as “It is impossible to find 
any power of two which is divisible by three.” We cannot conceive any 
universe in which this statement would be untrue, whereas we can quite 
readily imagine a universe in which any physical postulate of impotence 
would be untrue. 

It seems possible that while physics must continue to progress by
building on experiments, any branch of it which is in a highly developed
state may be exhibited as a set of logical deductions from postulates of 
impotence, as has already happened to thermodynamics. We may 
therefore conjecturally look forward to a time in the future when a 
treatise on any branch of physics could, if so desired, be written in the 
same style as Euclid’s Elements of Geometry, beginning with some a
priori principles, namely, postulates of impotence, and then deriving 
everything else from them by syllogistic reasoning. 

With our minds thus prepared, let us approach the doctrines that 
Sir Arthur Eddington and Professor E. A. Milne have lately given to 
the world. 

Milne begins by regarding the universe as an aggregate, of which 
one member is our galactic system of stars, the other members being the 
extra-galactic nebulae, each of which is actually a vast system of stars. 
He represents this aggregate in an idealised fashion as a collection of 
particles, each particle corresponding to one nebula; and each particle 
is supposed to carry an observer, equipped with a clock and a theodolite 
and with apparatus for sending and receiving light-signals. Milne then 
postulates that this ideal universe satisfies what he calls the cosmological
principle, namely, that the map of the world which any particle-observer
constructs from his own observations is identical with the map which any
other particle-observer constructs from his own observations; that is,
each sees the same sequence of views of the life-history of the universe.

The cosmological principle evidently bears a fairly close resemblance 
to the axiom of relativity, and may, like it, be expressed in the form of 
a postulate of impotence. Thus what Milne does essentially is to assume 
a postulate of impotence and then deduce its logical consequences, much 
as the science of thermodynamics is deduced from the Second Law; or, 
to go further back, much in the same way as Euclid deduced the whole 
of geometry from his common notions and postulates. Indeed, Milne 
describes his work as an attempt to do for systems of particles in motion 
what the Greeks did for static assemblages of points. He arrives at a 
detailed account of the life-history of the universe, and derives the laws 
of dynamics, the inverse-square law of gravitation, and the equations of 
electromagnetism. 

The work of Sir Arthur Eddington is, on the philosophical side, 
more extensive and much more difficult to characterise than that of 
Professor Milne. In two books, published in 1936 and 1939 respectively,
he has set forth systematically what is offered as a complete philosophy 
of physical science. 

The origin of his distinctive ideas may, I think, be traced to the
circumstance that in the upbuilding both of relativity and of quantum
mechanics the question, “What is it that we really observe?” has played 
a great part. The critical examination arising out of this question showed 
that the older physicists had been accustomed to talk about things that 
are inaccessible to observation, whose existence and meaning are therefore 
highly doubtful. It came to be recognised, for instance, that no
significance can be attached to the term “absolute velocity in space,” and that
it is impossible to imagine an experiment which could follow continuously 
the motion of an electron in an atom. These advances towards precision 
of thought, and the remarkable consequences which followed from them, 
made on Eddington a profound impression, and his aim in recent years 
has been to extend them, by making a thoroughgoing study of the nature 
and limitations of our knowledge of the physical sciences. 

His attitude may be indicated by an analogy. Suppose a sea- 
fisherman invariably finds on examining his catch that no fish in it is 
less than two inches long. He might be inclined to assert as a fundamental 
law of zoology that all fishes are at least two inches in length, did he not 
know the real reason for the observed fact, namely, that he was fishing 
with a net of two-inch mesh. In interpreting the analogy, we may
compare the net to the methods of research employed in physics, and the
two-inch minimum for fishes to the laws physicists have discovered, 
which have in the past been regarded as describing properties of the 
real world, but which may be merely the necessary consequences of the 
procedure used in obtaining them. Hence Eddington's insistence on the 
importance of a proper investigation of the sensory equipment with 
which we observe, and the mental equipment by aid of which we formulate 
the results of observation. From this scrutiny he derives certain
conclusions which he calls epistemological principles (principles relating
to the method or ground of knowledge) and which, he asserts, are adequate to
supersede entirely all observation and experiment as the basis of physical
science.

This claim is so astounding that it must be stated in his own words. 
“I believe,” he says,* “that the whole system of fundamental hypotheses
[of physics] can be replaced by epistemological principles. Or to put it 
equivalently, all the laws of nature that are usually classed as fundamental 
can be foreseen wholly from epistemological considerations.” “This 
means,” he adds,† “that the fundamental laws and constants of physics
are wholly subjective, being the mark of the observer's sensory and
intellectual equipment on the knowledge obtained through such equipment,

* The Philosophy of Physical Science, 1939, p. 56.

† Ibid., p. 104.

for we could not have this kind of a priori knowledge of laws
governing an objective universe.” “An intelligence,” he says in another
place,* “unacquainted with our universe, but acquainted with the system
of thought by which the human mind interprets to itself the content of its
sensory experience, should be able to attain all the knowledge of physics
that we have attained by experiment. He would not deduce the particular
events and objects of our experience, but he would deduce the
generalisations we have based on them. For example, he would infer the existence
and properties of sodium, but not the dimensions of the earth.”

The doctrine thus proclaimed is so entirely contrary to all the received 
ideas of the nature of science, that it will certainly not be accepted 
without a careful examination. In the first place, we may remark that 
Eddington’s epistemological principles cannot be absolutely antecedent 
to all experience of the external world, for without some such experience 
it would be impossible to attach any meaning to the language employed 
(this indeed he admits). In the second place, we may observe that the 
notion of an “intuitive truth” has a sense that is relative rather than 
absolute. Take, for example, the statement that the sum of 1 and 5 is 
equal to the sum of 2 and 4. We are accustomed to speak of this as a 
proposition necessitated by pure logic, and indeed, to those who have 
mastered Whitehead and Russell’s Principia Mathematica, so it is; but 
it is not an intuitive truth to a savage who cannot count beyond four, 
and there is no doubt that it came to be accepted only as the result of a 
long development of culture, in which the operation was frequently 
performed of adjoining one object to five objects and observing that the 
same collection could be obtained by adjoining two objects to four objects. 
That is to say, a statement which we believe from inescapable necessity 
was originally a conviction arrived at from countless experiments. We 
must therefore envisage the possibility that the postulates of impotence 
and the Eddingtonian principles, to which most physicists, if they would 
assent to them at all, would assent on the ground of observation, may, 
by a more highly cultured or more penetrating mind, be perceived as 
self-luminous and irresistible truths, for which all proof would be
superfluous. In Indian philosophy, I understand, there is a doctrine which
asserts the possibility of an ineffable union with the infinite intelligence; 
but one hesitates to put forward, as the solution of the difficulties attending 
the epistemological assumptions, the suggestion that some such condition 
is already enjoyed by their distinguished author. 

In order to examine more closely the validity of the thesis before us, 
let us consider in detail one of its principles, and the physical results

* Relativity Theory of Protons and Electrons, 1936, p. 327.

which Eddington claims to have deduced from it. An assumption which
he specially mentions* is the “special relativity principle.” This, and
some of his other epistemological axioms, happen to be closely connected
with some of what I called “postulates of impotence”; but the concept
of a postulate of impotence is quite different from Eddington’s concept
of an epistemological principle, since the postulates of impotence are
generalisations from experiment, which epistemological principles expressly
are not; and also because some of the latter are not assertions of inability
to do anything.

Take, then, the special relativity principle, and a physical result which 
Eddington offers as a deduction from it, namely, the law of increase of 
mass with velocity. The story of this law is a curious one. As everybody 
knows, in Newtonian dynamics every particle of matter has a definite 
mass or measure of inertia, which is invariable—it does not change with 
the situation, temperature, or velocity of the particle. It was, however,
shown long ago by J. J. Thomson and Lorentz, from electromagnetic 
theory, that when a body is electrified its inertia when in motion must be 
greater than when it is uncharged, and, moreover, that the additional 
mass thus created must increase when the body’s velocity increases. An 
expression for the increase of the mass of an electron with its velocity was 
found mathematically by Lorentz, and was confirmed in the laboratory by 
experiments on the deflection of high-speed electrons in an electric and 
magnetic field. In the theoretical derivation, it was assumed that the 
mass of the electron was entirely electromagnetic, and the observational 
verification of the formula was naturally taken to be a confirmation of 
this assumption. It was therefore with amazement that physicists learnt,
soon after the discovery of the theory of relativity, that Lorentz’s formula 
had really nothing to do with the electromagnetic theory by which it had 
been obtained, and that it was true not merely for electrons, but for all 
bodies whether electrified or not—it is in fact simply an immediate 
consequence of relativist dynamics. 

Now let us hear Eddington on the subject. “As an example,” he
says,† “we may take the law of increase of mass with velocity, which has
been the subject of many famous experiments. It is now realised that
this law automatically results from the engrained form of thought which 
separates the fourfold order of events into a threefold order of space and 
an order of time. When knowledge is formulated in a frame which 
compels us to separate a time dimension from the fourfold order to which 
it belongs, a component called the mass is correspondingly separated

* The Philosophy of Physical Science, p. 39.

† Ibid., p. 116.

from the fourfold vector to which it belongs: and it requires no very
profound study of the conditions of separation to see how the separated
component is related to the rest of the vector which prescribes the velocity.
It is this relation which is rediscovered when we determine experimentally
the change of mass with velocity.”

This account of the matter ignores everything in the deduction of the 
mass-velocity formula except one particular step, namely, the separation 
of the temporal component from a vector in space-time. But, before this 
separation can be effected, we must know all about the vector—that is, 
we must already be in possession of relativist dynamics. Now relativist 
dynamics is based in part on the Axiom of Relativity, which may perhaps 
be called an epistemological principle (though actually it was arrived at 
only as an inference from an enormous number of experiments); but the 
Axiom of Relativity taken alone is certainly not sufficient to yield all that 
is wanted. There is not, so far as I know, any treatment of relativity- 
theory which does not in some way, directly or indirectly, make use of 
the assumption that the formulae of relativist dynamics reduce to the 
formulae of Newtonian dynamics when the velocities concerned are small. 
Indeed, without this assumption it does not seem possible to set up the 
desired connexion between the mathematical equations, on the one hand, 
and phenomena in the external world, on the other. But Newtonian 
dynamics was undoubtedly founded on experiment, and it is difficult 
to see that Eddington has provided any other derivation. 
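
For reference, the law under discussion may be written out. The block below is only a sketch in modern notation, which the address itself does not use: the symbols $m_0$ (the mass at rest), $v$ (the velocity of the body) and $c$ (the velocity of light) are assumptions introduced here. It states Lorentz's formula together with its expansion for small velocities, which is the sense in which the formulae of relativist dynamics reduce to the Newtonian case of an invariable mass.

```latex
\[
m \;=\; \frac{m_0}{\sqrt{1 - v^2/c^2}}
  \;=\; m_0\Bigl(1 + \frac{v^2}{2c^2} + \frac{3v^4}{8c^4} + \cdots\Bigr),
\qquad
m \to m_0 \ \text{ as } \ v/c \to 0 .
\]
```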

The perplexity generated by his account of the laws of nature is not 
lessened by studying his work on the fundamental constants. “Not only
the laws of nature,” he says,* “but the constants of nature can be deduced
from epistemological considerations, so that we can have a priori
knowledge of them.” He actually obtains four constants, namely, the number
of particles in the universe, the ratio of the electric to the gravitational force 
between a proton and an electron, the ratio of the mass of the proton to 
the mass of the electron, and what is called the fine-structure constant, 
which is of great importance in atomic physics. The last three of these 
were already known experimentally, but the first, the number of particles 
in the universe, which he calls the cosmical number, is hardly susceptible
of observational verification; it is derived by epistemological reasoning of
the purest kind, being in fact equated to the number of independent
quadruple wave-functions, which is $2 \times 136 \times 2^{256}$. Naturally the question
arises as to what exactly is meant by the “number of particles in the
universe.” At the time when the theory was originally propounded it
was believed that all atoms are composed of two kinds of elementary

* The Philosophy of Physical Science, p. 58.

particles, namely, protons and electrons; and Eddington defined the
cosmical number to be simply the number of protons plus the number of
electrons in the world. Since then the situation has become more
complicated: a fresh elementary particle, the neutron, was found
experimentally in 1932; another, the positron, theoretically in 1930 and
experimentally in 1933; another, the neutrino, theoretically in 1931; and yet
another, the meson, theoretically in 1935 and observationally in 1937. In
the light of these discoveries, some re-interpretation of the cosmical
number was evidently necessary. Eddington has decided * that a neutron
should count as two particles, and a positron as minus one, so that the
creation and annihilation of electrons and positrons in pairs will not
affect the total. What is to be done about mesons and neutrinos is not
yet settled. It must be said that readjustments of this kind are too
suggestive of patching-up to be altogether satisfying. However, their
author’s confidence has in no way diminished, and he now claims † that
the cosmical number determines the speed of recession of the distant
nebulae and the range of action of the forces between the particles in an
atomic nucleus; and that all these things, being accounted for
epistemologically, are ipso facto subjective: they are demolished as part of the
objective world.‡

From the Greek nature-philosophers to Eddington the wheel has 
come full circle. Each of them believed, for example, that the value of
the sum of the angles of a triangle is a necessary consequence of the
constitution of the human mind. But they do not agree on what the
value is. The Greeks said it was two right angles, while Eddington says
that for triangles of astronomical size it is greater than two right angles
by an amount depending on the size of the triangle. 

A word may be said in conclusion on the question as to whether 
Eddington’s general attitude may be subsumed under any of the historical 
philosophical schools. The two philosophers to whom he acknowledges 
himself indebted are Kant § and Bertrand Russell ||, an oddly-assorted
pair, since Russell has devoted much of his activity to confuting Kant.
What Eddington admires in the philosopher of Königsberg is the a priori
doctrine regarding space, to which his own principles bear a certain
resemblance. This affiliation is not likely to commend the new
epistemology to mathematicians, who still remember with bitterness the harm

* The Philosophy of Physical Science, p. 170.

† Ibid., p. 177.

‡ Ibid., p. 59.

§ Ibid., p. 188.

|| Ibid., p. 151.

that was done in the early days of non-Euclidean geometry by Kantian
opposition; and, perhaps partly on this account, Eddingtonian views have 
not hitherto made many converts. The admiration which is universally 
accorded to the genius of their author is assuredly not diminished by his 
stupendous attempt to base the study of the external world on a new 
foundation: but on the work itself the verdict as yet is not proven. 


(Issued separately January 26, 1942.)



176 


D. N. Lawley 


XIV.— Further Investigations in Factor Estimation. By D. N. 
Lawley, B.A., Moray House, University of Edinburgh.
Communicated by Professor GODFREY H. THOMSON.

(MS. received October 21, 1941. Read December 1, 1941.) 

I. In a previous article (Lawley, 1940) the present writer applied 
the method of maximum likelihood to the problem of estimating the 
loadings of a set of tests in a number of factors. The assumption there 
made was that the observed test scores and also the factors involved 
were normally distributed over the population of persons tested. This 
assumption has been criticised by Gale Young (1941), who has pointed 
out that nothing need be implied concerning the distributions of the 
observed scores over a population of individuals, but that it is only
necessary to make assumptions concerning the error distributions. It 
may, however, be remarked that in other branches of statistical inquiry 
it is a usual procedure to assume that the observed variates are normally 
distributed throughout the population, and that even when this assumption 
is not perfectly fulfilled the practical effects are generally not serious 
provided that the departures from normality are not too large. In fact, 
as we shall indicate, the method of estimation previously given does in 
practice lead to results which are in accord with those obtained by other 
methods commonly in use. It is of course true that the estimation of 
the factor loadings of the tests administered is only half the problem, 
and that the ultimate object is to estimate the amounts of the factors 
possessed by the various individuals tested. Nevertheless, once the 
factor loadings of the tests have been found, the individual factor
measurements may then be estimated either from the equations given by Bartlett
(1937) or from those of Thomson (1936). 

In his solution of the factor problem Gale Young makes, for practical 
reasons, what is equivalent to the supposition that the error variance on 
a hypothetical infinity of trials is independent of the individual but 
depends only on the test. He assumes, however, that the error variances 
have been previously determined from other data. In the present paper
we shall show that this last assumption is not strictly necessary since,
given the scores of a group of individuals on a set of tests, it is in theory
possible to estimate simultaneously not only the factor measurements
of the individuals and the factor loadings of the tests but also the error
variances.

2. Let us suppose that n tests are given to a group of N individuals,
and let us make the hypothesis that the scores obtained by them depend
on the presence of m common factors. Then, if $x_{ia}$ is the score of the
$a$th individual in the $i$th test, we shall assume that

\[
x_{ia} = \sum_{r=1}^{m} (\lambda_{ri}\phi_{ra}) + e_{ia}, \tag{1}
\]

where $\lambda_{ri}$ is the loading of the $i$th test and $\phi_{ra}$ the measure of the $a$th
individual in the $r$th factor.

It will be further assumed that the component $e_{ia}$ is due to random
error and that, for a given value of $i$, $e_{ia}$ is distributed normally with a
variance $\sigma_i^2$ which is independent of the value of $a$.* We may also,
without loss of generality, suppose that $x_{ia}$, $e_{ia}$, and $\phi_{ra}$ are measured from
the corresponding sample means, so that

\[
\sum_{a=1}^{N} (x_{ia}) = \sum_{a=1}^{N} (e_{ia}) = 0
\]

for all values of $i$, and

\[
\sum_{a=1}^{N} (\phi_{ra}) = 0
\]

for all values of $r$.

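The joint distribution of the set of errors $\{e_{ia}\}$ then takes the form

\[
L \prod_{i,a} de_{ia},
\]

where L is the likelihood function and is equal, apart from a constant factor, to

\[
\frac{1}{\sigma_1^{N-1}\sigma_2^{N-1}\cdots\sigma_n^{N-1}}
\exp\Bigl\{-\tfrac{1}{2}\sum_i\sum_a \frac{e_{ia}^2}{\sigma_i^2}\Bigr\}.
\]

Using equation (1) we find that

\[
\log_e L = -(N-1)\sum_i \log_e \sigma_i
 - \tfrac{1}{2}\sum_i \Bigl\{\frac{1}{\sigma_i^2}\sum_a \Bigl[x_{ia} - \sum_r (\lambda_{ri}\phi_{ra})\Bigr]^2\Bigr\}
 + \text{a constant}.
\]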


* We may, if desired, suppose $e_{ia}$ to include a specific factor, provided that such
specifics are assumed to be normally distributed over the population of individuals
tested.

Efficient estimates of the unknown parameters $\{\sigma_i^2\}$, $\{\lambda_{ri}\}$, and $\{\phi_{ra}\}$
may as usual be found by choosing values of these quantities which
make L a maximum. In order to do this we differentiate $\log_e L$ with
respect to each of the parameters in turn and equate the results to zero.
Denoting the above estimates by $\{s_i^2\}$, $\{l_{ri}\}$, and $\{f_{ra}\}$ respectively (so as
to make a distinction between population parameters and their sample
estimates), we are thus led to the equations

\[
s_i^2 = \frac{1}{N-1}\sum_a \Bigl[x_{ia} - \sum_r (l_{ri}f_{ra})\Bigr]^2
\qquad (i = 1, 2, \ldots, n), \tag{2}
\]

\[
\sum_a\sum_q (l_{qi}f_{qa}f_{ra}) = \sum_a (x_{ia}f_{ra})
\qquad (i = 1, 2, \ldots, n \text{ and } r = 1, 2, \ldots, m), \tag{3}
\]

\[
\sum_i\sum_q (l_{ri}l_{qi}f_{qa}/s_i^2) = \sum_i (l_{ri}x_{ia}/s_i^2)
\qquad (a = 1, 2, \ldots, N \text{ and } r = 1, 2, \ldots, m). \tag{4}
\]


These equations are not all independent, however, since the result
of multiplying both sides of (3) by $l_{pi}/s_i^2$ (p = 1, 2, . . . m) and summing
over i is the same as that obtained by multiplying both sides of (4) by
$f_{pa}$ and summing over a. To obtain a unique solution we shall therefore
impose $m^2$ additional conditions. Firstly, the scales of measurement
for the m factors are entirely arbitrary, so that we may for convenience
choose them so as to make

\[
\sum_a f_{ra}^2 = N - 1,
\]


for all values of r. We may further suppose that the factors are such as
to satisfy the “orthogonality” conditions

\[
\sum_a (f_{qa}f_{ra}) = \sum_i (l_{qi}l_{ri}/s_i^2) = 0,
\]

for all values of q, r such that $q \neq r$.

Equation (2) can now be simplified, using also (3), and may be
expressed in the form

\[
s_i^2 = \frac{1}{N-1}\sum_a x_{ia}^2 - \sum_r l_{ri}^2.
\]

If we let $\{a_{ij}\}$ denote the set of sample variances and covariances, so that

\[
a_{ij} = \frac{1}{N-1}\sum_a (x_{ia}x_{ja}),
\]

then this equation becomes, even more simply,

\[
s_i^2 = a_{ii} - \sum_r l_{ri}^2. \tag{5}
\]
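
The simplification leading to (5) can be traced step by step. What follows is a reconstruction rather than the author's own working; it assumes the scale condition $\sum_a f_{ra}^2 = N-1$ and the orthogonality of the factor estimates, $\sum_a f_{qa}f_{ra} = 0$ for $q \neq r$, under which equation (3) reduces to $\sum_a x_{ia}f_{ra} = (N-1)\,l_{ri}$.

```latex
\begin{align*}
(N-1)\,s_i^2
  &= \sum_a \Bigl[x_{ia} - \sum_r l_{ri}f_{ra}\Bigr]^2 \\
  &= \sum_a x_{ia}^2
     - 2\sum_r l_{ri}\sum_a x_{ia}f_{ra}
     + \sum_r\sum_q l_{ri}l_{qi}\sum_a f_{ra}f_{qa} \\
  &= \sum_a x_{ia}^2 - 2(N-1)\sum_r l_{ri}^2 + (N-1)\sum_r l_{ri}^2 \\
  &= (N-1)\Bigl(a_{ii} - \sum_r l_{ri}^2\Bigr).
\end{align*}
```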

r 

Equations (3) and (4) may also be simplified by using matrix notation.
Let X, L, F, and A denote the matrices whose typical elements are
$x_{ia}$, $l_{ri}$, $f_{ra}$, and $a_{ij}$ respectively; let V be the diagonal matrix whose
elements are the error variances $s_1^2, s_2^2, \ldots, s_n^2$; and let J be the diagonal
matrix $LV^{-1}L'$ (it will be diagonal in consequence of the last condition
which we imposed above). Then equation (3) may now be written

\[
(N-1)\,L = FX', \tag{6}
\]

while equation (4) becomes

\[
JF = LV^{-1}X,
\]

or

\[
F = J^{-1}LV^{-1}X. \tag{7}
\]

Finally, by eliminating F from equations (6) and (7) and using the fact
that $A = \frac{1}{N-1}XX'$, we find that

\[
JL = LV^{-1}A. \tag{8}
\]

3. It may be noted that the above equations give results which are 
independent of the units of measurement used, so that we may equally 
well use the correlation matrix R, with unit diagonal elements, in place 
of the covariance matrix A. Equation (7) is identical (except for the
difference of notation) with that obtained by Bartlett (loc. cit.) for
estimating a person’s factors given the test loadings; the latter may in
theory be found from equation (8). It does not, however, seem easy to
find a satisfactory method of solving (8), since all the more obvious
iterative procedures either do not in general converge, or else tend to
unacceptable solutions in which one or more of the error variances vanish.

As an example we may consider the case where

\[
A = \begin{bmatrix}
1.000 & .374 & .560 & .335 \\
.374 & 1.000 & .219 & .454 \\
.560 & .219 & 1.000 & .449 \\
.335 & .454 & .449 & 1.000
\end{bmatrix}.
\]

If the presence of two general factors is assumed, it may be verified that
equation (8) is satisfied (roughly) by

\[
L = \begin{bmatrix}
.815 & .608 & .794 & .725 \\
-.354 & .586 & -.371 & .477
\end{bmatrix},
\qquad
V = [\,.210 \quad .287 \quad .232 \quad .247\,]
\]

(the elements of V being for convenience printed in a row instead of
diagonally); but no method has so far been found for arriving at this
solution by successive approximations. It has been found possible to
apply a method of this kind to the simple example given below, but this
example is special in that the second and third tests have identical
correlations with the other two, which is apparently the only reason why
the method works.
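
The statement that the four-test, two-factor solution satisfies equation (8) roughly can be checked numerically. The sketch below is an illustration only and assumes NumPy; the variable names are hypothetical. It takes the quoted loadings, forms the error variances from equation (5), and computes the residual of $JL = LV^{-1}A$.

```python
import numpy as np

# Correlation matrix of the four-test example (used in place of A).
A = np.array([
    [1.000, 0.374, 0.560, 0.335],
    [0.374, 1.000, 0.219, 0.454],
    [0.560, 0.219, 1.000, 0.449],
    [0.335, 0.454, 0.449, 1.000],
])

# Quoted two-factor loadings, one row per factor.
L = np.array([
    [0.815, 0.608, 0.794, 0.725],
    [-0.354, 0.586, -0.371, 0.477],
])

# Error variances from equation (5): s_i^2 = a_ii - sum_r l_ri^2.
v = np.diag(A) - (L ** 2).sum(axis=0)

W = L / v                 # L V^{-1}: each column divided by its error variance
J = W @ L.T               # J = L V^{-1} L'
residual = J @ L - W @ A  # vanishes when equation (8) holds exactly

print(np.round(v, 3))          # error variances implied by the loadings
print(np.round(J, 3))          # should be close to diagonal
print(np.abs(residual).max())  # largest departure from equation (8)
```

The off-diagonal element of J and the residuals come out small but not zero, which is consistent with the word "roughly": the loadings are quoted to three figures only.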

If we have

\[
A = \begin{bmatrix}
1.0 & .4 & .4 & .2 \\
.4 & 1.0 & .7 & .3 \\
.4 & .7 & 1.0 & .3 \\
.2 & .3 & .3 & 1.0
\end{bmatrix},
\]

and if we suppose the existence of one general factor, then the matrix of
loadings L will consist simply of one row, and will satisfy the equation

\[
hL = LV^{-1}A,
\]

where

\[
h = LV^{-1}L' \quad (\text{and } h^2 = LV^{-1}AV^{-1}L').
\]

As a first approximation to L we shall take the loadings given by
Hotelling's first “principal component,” namely

\[
L_1 = [\,.662 \quad .857 \quad .857 \quad .540\,].
\]

If we let $V_1$ denote the corresponding first approximation to V, then a
second approximation to L is now given by

\[
L_2 = \frac{1}{h_1}\,L_1V_1^{-1}A,
\]

where

\[
h_1^2 = (L_1V_1^{-1}A)(V_1^{-1}L_1').
\]

The actual calculations are set out below.

    $L_1$               .662    .857    .857    .540
    $V_1$               .5618   .2656   .2656   .7084
    $L_1V_1^{-1}$      1.1784  3.2267  3.2267   .7623
    $L_1V_1^{-1}A$     3.9122  6.1854  6.1854  2.9340

    $h_1^2 = 46.7636$,   $1/h_1 = .14623$

    $L_2$               .5721   .9045   .9045   .4290.


The above process is then repeated until the required degree of
approximation is reached, the successive approximations to L being as follows:

    .5721   .9045   .9045   .4290
    .5058   .9177   .9177   .3772
    .4845   .9198   .9198   .3634
    .4801   .9201   .9201   .3607
    .4793   .9202   .9202   .3602
    .4791   .9202   .9202   .3601.

Thus we see that, correct to three figures, the test loadings are given by

\[
L = [\,.479 \quad .920 \quad .920 \quad .360\,],
\]

and the error variances by

\[
V = [\,.771 \quad .153 \quad .153 \quad .870\,].
\]
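
The successive approximations of this section can be reproduced in a few lines. The following is a minimal sketch, assuming NumPy and unit variances on the diagonal of A; the function name `method_ii_step` is hypothetical. Each step forms V from equation (5), then $L V^{-1}A$, and rescales by h, where $h^2 = (LV^{-1}A)(V^{-1}L')$.

```python
import numpy as np

# Correlation matrix of the simple four-test example.
A = np.array([
    [1.0, 0.4, 0.4, 0.2],
    [0.4, 1.0, 0.7, 0.3],
    [0.4, 0.7, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
])

def method_ii_step(L, A):
    """One successive-approximation step for h L = L V^{-1} A (single factor)."""
    v = np.diag(A) - L ** 2   # equation (5): current error variances
    w = L / v                 # L V^{-1}
    u = w @ A                 # L V^{-1} A
    h = np.sqrt(u @ w)        # h^2 = (L V^{-1} A)(V^{-1} L')
    return u / h

# Starting values: Hotelling's first principal component.
L = np.array([0.662, 0.857, 0.857, 0.540])

history = []
for _ in range(6):
    L = method_ii_step(L, A)
    history.append(L.copy())

for row in history:
    print(np.round(row, 4))
```

Run with full floating-point precision, the rows agree with the tabulated approximations to within the rounding of the hand calculation. As the text remarks, the scheme is only known to work for this special example, so the sketch should not be read as a general solver for equation (8).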





4. The above method of estimation will in future be referred to as 
“method II,” in order to distinguish it from the previous method (Lawley, 
loc. cit.) which we shall call “method I.” 

If we continue to use the notation so far adopted, and if we allow $\sigma_i^2$
to represent not only the variance due to error but also the variance due
to a possible factor specific to the $i$th test, then the equation of estimation
for method I may be written

\[
L = LC^{-1}A, \tag{9}
\]

where $C = L'L + V$.

Now by using the identity

\[
(I + J)\,LC^{-1} = LV^{-1},
\]

it is easily seen that equation (9) is equivalent to

\[
JL = LV^{-1}(A - V); \tag{10}
\]

and J is as before a diagonal matrix.
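
The identity, and the passage from (9) to (10), can be verified directly from the definitions $J = LV^{-1}L'$ and $C = L'L + V$. The lines below are a sketch of that verification and use no assumptions beyond those definitions.

```latex
\begin{align*}
(I + J)\,L &= L + LV^{-1}L'L = LV^{-1}(V + L'L) = LV^{-1}C
  &&\Longrightarrow\quad (I + J)\,LC^{-1} = LV^{-1}, \\
(I + J)\,L &= (I + J)\,LC^{-1}A = LV^{-1}A
  &&\Longrightarrow\quad JL = LV^{-1}A - L = LV^{-1}(A - V).
\end{align*}
```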

This equation is the same as (8), the corresponding equation for 
method II, except for the substitution of (A-V) for A; but whereas 
we were unable to find a general iterative method for solving (8) we have 
been able to derive one for solving (10). The method given below may 
be regarded as superseding that previously suggested, which suffered 
from the disadvantage that it necessitated the calculation of the reciprocal 
of the correlation matrix, a somewhat laborious undertaking when the 
number of tests administered is large. 

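Suppose for convenience that just two general factors are assumed
present, and let the two rows of L be denoted by $L_p$ and $L_q$. Let $L_{p1}$,
$L_{q1}$, and $V_1$ be first approximations to $L_p$, $L_q$, and V. Then second
approximations are given by

\[
L_{p2} = \frac{1}{h_{p1}}\,L_{p1}V_1^{-1}(A - V_1),
\]

\[
L_{q2} = \frac{1}{h_{q1}}\,L_{q1}V_1^{-1}(A - L_{p2}'L_{p2} - V_1)
       = \frac{1}{h_{q1}}\bigl\{L_{q1}V_1^{-1}A - (L_{q1}V_1^{-1}L_{p2}')L_{p2} - L_{q1}\bigr\},
\]

where, writing $P_{p1} = h_{p1}L_{p2}$ and $P_{q1} = h_{q1}L_{q2}$ for the rows before rescaling,

\[
h_{p1}^2 = P_{p1}V_1^{-1}L_{p1}', \qquad h_{q1}^2 = P_{q1}V_1^{-1}L_{q1}'.
\]

This process, repeated indefinitely, appears to converge in all cases.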

When only one factor is to be estimated the calculations involved 
are extremely simple. Thus in the example given at the end of the 
preceding section, if we take as our approximation to L the loadings 
given by Spearman’s method of estimating g, namely, 

\[
L_1 = [\,.496 \quad .823 \quad .823 \quad .374\,],
\]

then the first stage of the calculation is as follows:

    $L_1$                           .496    .823    .823    .374
    $V_1$                           .7540   .3227   .3227   .8601
    $L_1V_1^{-1}$                   .6578  2.5504  2.5504   .4348
    $L_1V_1^{-1}A$                 2.7851  4.7292  4.7292  2.0966
    $P_1 = L_1V_1^{-1}(A - V_1)$   2.2891  3.9062  3.9062  1.7226

    $h_1^2 = 22.1795$,   $1/h_1 = .21234$

    $L_2$                           .4861   .8294   .8294   .3658.

Continuing the above process, we find that the approximations for L
converge to

\[
[\,.4807 \quad .8361 \quad .8361 \quad .3623\,].
\]

Hence (as also obtained by the previous method of calculation), the
loadings of the four tests are given, correct to three figures, by

\[
L = [\,.481 \quad .836 \quad .836 \quad .362\,],
\]

and the error (or specific) variances by

\[
V = [\,.769 \quad .301 \quad .301 \quad .869\,].
\]
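
The one-factor calculation for method I can be sketched in the same way. This is an illustration under stated assumptions, not the author's procedure verbatim: it assumes NumPy and unit variances on the diagonal of A, the function name `method_i_step` is hypothetical, and the scaling $h^2 = P V^{-1}L'$ is inferred from the tabulated value of $h_1^2$.

```python
import numpy as np

# Correlation matrix of the simple four-test example.
A = np.array([
    [1.0, 0.4, 0.4, 0.2],
    [0.4, 1.0, 0.7, 0.3],
    [0.4, 0.7, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
])

def method_i_step(L, A):
    """One step of the iteration for J L = L V^{-1} (A - V), single factor."""
    v = np.diag(A) - L ** 2   # error (or specific) variances
    w = L / v                 # L V^{-1}
    p = w @ A - L             # P = L V^{-1} (A - V)
    h = np.sqrt(p @ w)        # h^2 = P V^{-1} L' (inferred scaling)
    return p / h

# Starting values: Spearman's estimate of the loadings on g.
start = np.array([0.496, 0.823, 0.823, 0.374])

first = method_i_step(start, A)   # second approximation L_2
L = start.copy()
for _ in range(50):               # repeat until the values settle
    L = method_i_step(L, A)

print(np.round(first, 4))
print(np.round(L, 4))
print(np.round(np.diag(A) - L ** 2, 3))
```

The number of repetitions (50) is an arbitrary choice made for the sketch; a convergence tolerance on successive values of L would serve equally well.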

5. It may be of interest to compare methods I and II with two other 
methods of factor estimation, namely, Hotelling’s process of extracting 
“principal components” and the modification of it by Thomson (1934). 

In Hotelling’s process, if we suppose the tests to have been standardised
(so that A is identical with the correlation matrix R), the matrix of
loadings L satisfies the equation

\[
KL = LA, \tag{11}
\]

where

\[
K = LL';
\]

and in view of the fact that the factors satisfy an orthogonality condition,
K is a diagonal matrix.

Comparing equations (8) and (11) we see that, formally, method II 
differs from Hotelling’s only in having an extra weighting factor $V^{-1}$,
though, as the example given at the end of section 3 shows, the differences
between the results of the two processes may in practice be considerable. 
Identical results will be obtained only if the proportion of error variance 
to total variance is the same for each test, i.e. if the tests are all equally 
reliable. Hotelling’s process is in fact given by the method of maximum 
likelihood when it is assumed from the start that this fact is so. 
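
For a single component, equation (11) says that L is an eigenvector of A scaled so that $LL'$ equals the corresponding eigenvalue. The sketch below, which assumes NumPy and uses `numpy.linalg.eigh`, recovers Hotelling's first principal component for the simple four-test example; it illustrates equation (11) and is not a description of Hotelling's own iterative procedure.

```python
import numpy as np

# Correlation matrix of the simple four-test example.
A = np.array([
    [1.0, 0.4, 0.4, 0.2],
    [0.4, 1.0, 0.7, 0.3],
    [0.4, 0.7, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
])

# eigh returns the eigenvalues of a symmetric matrix in ascending order.
eigvals, eigvecs = np.linalg.eigh(A)
k = eigvals[-1]            # largest eigenvalue, playing the part of K
vec = eigvecs[:, -1]       # corresponding unit eigenvector
if vec.sum() < 0:          # fix the arbitrary sign so loadings are positive
    vec = -vec

L = np.sqrt(k) * vec       # scaled so that L L' = k

print(np.round(L, 3))
print(np.round(k, 4))
```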

Thomson has stated his process only for the case where one general 
factor g is to be estimated, but it seems clear that this process can be 
generalised to estimate several factors at once, the equation of estimation 
being in this case 

$$KL = L(A - V), \tag{12}$$

where again $K = LL'$ is a diagonal matrix. The actual calculations
could be carried out by means similar to those employed for method I,
the only difference being the omission of the weighting factor $V^{-1}$.
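One way such a calculation might run for a single factor is sketched below. It assumes an iteration patterned on the tabulated steps of method I with the weights omitted: form P = L(A - V), divide by the square root of PL', and recompute V from the new loadings. The loadings 0.8, 0.7, 0.6 and the starting values are hypothetical, chosen so that the correct answer is known.

```python
import math

true_l = [0.8, 0.7, 0.6]                      # hypothetical loadings
n = len(true_l)
# Exact one-factor correlation matrix: unit diagonal, l_i * l_j elsewhere.
A = [[1.0 if i == j else true_l[i] * true_l[j] for j in range(n)]
     for i in range(n)]

def thomson_step(L):
    """One unweighted step: P = L(A - V), then rescale so that K L = L(A - V)."""
    V = [A[i][i] - L[i] ** 2 for i in range(n)]          # current error variances
    P = [sum(L[i] * A[i][j] for i in range(n)) - L[j] * V[j] for j in range(n)]
    h = sum(p * l for p, l in zip(P, L))                 # h = P L'
    return [p / math.sqrt(h) for p in P]

L = [0.7, 0.7, 0.7]                           # arbitrary starting approximation
for _ in range(500):
    L = thomson_step(L)

K = sum(x * x for x in L)                     # K = LL'
V = [A[i][i] - L[i] ** 2 for i in range(n)]
rhs = [sum(L[i] * A[i][j] for i in range(n)) - L[j] * V[j] for j in range(n)]
residual = max(abs(K * l - r) for l, r in zip(L, rhs))
```

With data that fit one factor exactly, the iteration returns to the loadings from which A was built, and `residual` confirms that equation (12) is satisfied.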

Thus method I differs from the generalisation of Thomson’s process 
in that it gives greater weight to those tests whose error variances are 
proportionately small, and conversely. Both of these methods are such 
that when N, the number of individuals tested, tends to infinity the 
expected values of the correlation coefficients, as calculated from the 
factor loadings, tend to the corresponding population values; whereas 
in the other two methods considered this is not the case. It appears,
however, that the results of methods I and II will tend to equality when 
n, the number of tests administered, becomes large compared with m, 
the number of factors to be estimated. 

For comparison, the results obtained by all four methods in the 
example already considered are summarised below: 


Method.          Loadings Obtained.

Method I         .481    .836    .836    .362
Thomson          .486    .833    .833    .368
Method II        .479    .920    .920    .360
Hotelling        .662    .857    .857    .540


To sum up, it would seem that method I, used in conjunction with
the equation *

$$F = LC^{-1}X = (I + J)^{-1}LV^{-1}X \tag{13}$$

for estimating an individual’s factor measurements, has distinct
advantages over other methods of estimation. The amount of calculation
involved, using the new process put forward in this article, is not excessive,
even when more than one factor is to be estimated. Furthermore, this
method of estimation is almost the only one known to the writer for
which a satisfactory test has been rigorously established for deciding
whether a sufficient number of factors have been fitted. It may therefore
be as well to restate this test, which is applicable when N is large, using
the notation of the present article.

* This equation is that given by Thomson’s regression method of estimating a
person’s factors. The factors estimated by Bartlett’s method would in this case, however,
differ only in scale from those found by the regression method.
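For one factor, equation (13) reduces to a weighted sum of the test scores. The sketch below uses the loadings and variances of the worked example, and assumes the usual definitions C = L'L + V and J = LV^{-1}L', which are stated earlier in the paper and not in this passage. The function name `factor_score` is illustrative.

```python
loadings = [0.481, 0.836, 0.836, 0.362]     # L from the worked example
variances = [0.769, 0.301, 0.301, 0.869]    # V from the worked example

# J = L V^{-1} L' (a scalar for one factor; assumed definition).
J = sum(l * l / v for l, v in zip(loadings, variances))
# Weights (I + J)^{-1} L V^{-1}.
weights = [l / v / (1.0 + J) for l, v in zip(loadings, variances)]

def factor_score(x):
    """Estimate F for one person from standardised test scores x."""
    return sum(w * xi for w, xi in zip(weights, x))

# Check that the weights equal L C^{-1} with C = L'L + V (assumed definition):
# multiplying the weights by C should give back L.
wl = sum(w * l for w, l in zip(weights, loadings))
back = [wl * l + w * v for l, w, v in zip(loadings, weights, variances)]
```

The weights are largest for the two tests with the smallest error variances, which is the sense in which method I favours the more reliable tests.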

Let us make the hypothesis that there are exactly m general factors 
present, and let 



where as before {r?} represents the set of error (or specific) variances, 
and where 




r=1 


Then, under the above hypothesis, w is distributed as $\chi^2$ with p degrees of
freedom, where

$$p = \tfrac{1}{2}n(n - 1) - mn + \tfrac{1}{2}m(m - 1) = \tfrac{1}{2}\{(n - m)^2 - n - m\}.$$

If a significantly high value of w is obtained, this will indicate that we
must reject the hypothesis and assume the existence of more than m
factors.
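The two expressions for p can be checked against each other in a few lines. The function name below is illustrative.

```python
def degrees_of_freedom(n, m):
    """Degrees of freedom p for n tests and m general factors."""
    first = n * (n - 1) / 2 - m * n + m * (m - 1) / 2
    second = ((n - m) ** 2 - n - m) / 2
    if first != second:
        raise ValueError("the two forms of p disagree")
    return second

p_example = degrees_of_freedom(4, 1)    # four tests, one factor, as in the example
p_larger = degrees_of_freedom(6, 2)
```

For the four tests and single factor of the worked example the test has two degrees of freedom.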

6. It may of course happen that we are provided with independent
estimates of the error variances and covariances (it is not then necessary 
to assume that the errors are uncorrelated). Thus, for instance, the 
n tests may all have been given more than once, in parallel forms, to 
each of the N individuals. In that case it is possible to carry out an 
analysis of variance and covariance and obtain two independent sets of 
estimated variances and covariances, one set being due to differences 
between persons while the other is due only to error. 

Let these two sets be denoted by $\{a_{ij}\}$ and $\{b_{ij}\}$ respectively, and let
A and B be the matrices of which $a_{ij}$ and $b_{ij}$ are the typical elements.

Then the equation of estimation for the factor loadings will in this case
be

$$HL = LB^{-1}A, \tag{14}$$

where

$$H = LB^{-1}L'$$

is a diagonal matrix.

In the solution of this equation the rows of L are proportional to the
latent vectors of $B^{-1}A$, and the elements of H are the corresponding
latent roots. The individual factor measurements are given in terms
of the test scores by the equation

$$F = H^{-1}LB^{-1}X = LA^{-1}X. \tag{15}$$

This method of procedure represents nothing new, however, since it 
is equivalent to the “discriminant function” analysis put forward by 
Fisher (1938, 1939). 
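Equations (14) and (15) can be illustrated for two tests and one factor. In the sketch below the matrices A and B and the scores x are hypothetical, and the largest latent root is found by power iteration; it is an illustration of the algebra and not a recommended computing routine.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]      # hypothetical between-persons covariances
B = [[1.0, 0.0], [0.0, 2.0]]      # hypothetical error covariances

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def row_times(row, m):
    return [row[0] * m[0][j] + row[1] * m[1][j] for j in range(2)]

B_inv, A_inv = inverse_2x2(B), inverse_2x2(A)

l = [1.0, 1.0]
for _ in range(200):                          # power iteration on l -> l B^{-1} A
    l = row_times(row_times(l, B_inv), A)
    norm = math.sqrt(l[0] ** 2 + l[1] ** 2)
    l = [v / norm for v in l]

lb = row_times(l, B_inv)
image = row_times(lb, A)
h = image[0] * l[0] + image[1] * l[1]         # latent root (l has unit length)
scale = math.sqrt(h / (lb[0] * l[0] + lb[1] * l[1]))
L = [scale * v for v in l]                    # scaled so that L B^{-1} L' = h

x = [0.3, -1.2]                               # hypothetical test scores
LB = row_times(L, B_inv)
F_left = (LB[0] * x[0] + LB[1] * x[1]) / h    # H^{-1} L B^{-1} X
LA = row_times(L, A_inv)
F_right = LA[0] * x[0] + LA[1] * x[1]         # L A^{-1} X
```

The two forms of (15) agree because (14) gives $LB^{-1} = HLA^{-1}$.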

Summary.

7. A method of factor estimation is given in which assumptions
are made only about the form of the error distributions of the tests
administered. This is compared with a method previously suggested in
which, on the contrary, the test scores and the individual factor
measurements were assumed to be normally distributed over the population of
individuals tested. A comparison is also made with other processes at
present in use.


REFERENCES TO LITERATURE.

Bartlett, M. S., 1937. “The Statistical Conception of Mental Factors,”
Brit. Journ. Psychol., vol. xxviii, pp. 97-104.

Fisher, R. A., 1938. “The Statistical Utilization of Multiple Measurements,”
Ann. Eugen., vol. viii, pp. 376-386.

Fisher, R. A., 1939. “The Sampling Distributions of Some Statistics obtained from
Non-linear Equations,” Ann. Eugen., vol. ix, pp. 238-249.

Lawley, D. N., 1940. “The Estimation of Factor Loadings by the Method
of Maximum Likelihood,” Proc. Roy. Soc. Edin., vol. lx, pp. 64-82.

Thomson, G. H., 1934. “Hotelling’s Method modified to give Spearman’s g,”
Journ. Educ. Psychol., vol. xxv, pp. 366-374.

Thomson, G. H., 1936. “Some Points of Mathematical Technique in the Factorial
Analysis of Ability,” Journ. Educ. Psychol., vol. xxvii, pp. 37-54.

Young, Gale, 1941. “Maximum Likelihood Estimation and Factor Analysis,”
Psychometrika, vol. vi, pp. 49-53.


(Issued separately January 26, 1942.)





XV.— On the Estimation of Statistical Parameters. By A. C. 

Aitken, D.Sc., F.R.S., Mathematical Institute, University of 
Edinburgh, and H. Silverstone, Ph.D., New Zealand. 

(MS. received December 30, 1941. Read February 2, 1942.) 

1. Introductory.

THE present paper communicates some of the results given in a thesis
submitted by Mr Silverstone to the University of Edinburgh in May 1939. 
The starting-point, the adoption of the postulates of unbiased estimate 
and minimum sampling variance, and the analogies which would emerge 
with the theories of maximum likelihood and of linear estimation by 
Least Squares, were suggested by the senior author. Thanks are due 
to Dr R. P. Gillespie for some helpful preliminary discussion in 1937 and 
1938. 

The senior author is also responsible for the composition of this paper 
and for modifications of the original material. Owing to the hazards 
and delays of correspondence arising from the present war, it has not been
possible for the authors, one living in Scotland and the other in New 
Zealand, to continue their collaboration. 

2. Maximum Probability Density and Minimum Variance. 

In the theory of linear estimation by Least Squares from linear
observational equations affected by error, the normal equations for the optimal
values of the unknowns can be deduced by two quite different sets of
postulates. The first, (A), proceeds by (i) assuming a normal distribution
of errors of observation, and (ii) accepting as optimal values of the
unknowns those which make the compound probability density a maximum;
the second, (B), proceeds by (i) assuming that the optimal values are
unbiased linear combinations of the observations, and (ii) accepting
those particular linear combinations for which the compound variance
of error is a minimum. It is to be noted that in (B) the law of error of
the observations is not restricted to be of normal or of any assigned type,
and could indeed be different for each observation.

In the theory of estimation of statistical parameters the principle of
maximum likelihood (Fisher, 1921) bears a certain resemblance to postulate
(ii) of the first set (A) of postulates of Least Squares. The principle
accepts, in fact, as best estimate of an unknown parameter $\theta$ in a
probability function $\phi(x; \theta)$ of assigned type, that function
$t(x_1, x_2, \ldots, x_n)$ of the sample values $x_j$ which makes the compound
probability density of the $x_j$ a maximum. The vindication of the principle
is largely a posteriori, arising from the circumstance that if the estimating
function t has a sampling distribution of normal type, then, within the class of
unbiased analytic functions estimating $\theta$, t has also the least possible
variance. Even when the distribution of t is not normal, it tends more
and more to become so as the size of sample is increased, and the property
of minimum variance tends to hold with closer and closer approximation.

This analogy with the position in Least Squares suggests a basis for
estimation alternative to that of maximum likelihood and closely 
resembling the postulates (B). Let us adopt postulates (i) of unbiased 
or consistent estimate, (ii) of minimum variance of estimating function. 
The investigation of the consequences of these postulates sets the principle 
of maximum likelihood in an interesting light. For example, simple 
conditions emerge under which maximum likelihood provides an estimate 
accurately possessing minimum variance, even though the sample is 
finite and the distribution of the estimating function is not normal. At 
the same time the actual value of the minimum variance is obtained with 
special ease. 

The arbitrariness of the postulates needs no emphasis. They are 
inevitably arbitrary; but so are those of any proposed alternative set. 
The restriction to a linear operation in the condition of consistency is 
admittedly a severe one. One is free to choose a condition of consistency 
different from the one here adopted, but the subsequent analysis will be 
found decidedly less tractable, perhaps impossible. Again, since, as is 
well known, many plausibly posed questions in the Calculus of Variations 
do not admit of analytic solution, it evokes no surprise that wide classes 
of probability functions are not amenable to estimation of parameters by 
the suggested postulates. 

3. Postulates for Estimation of Parameters. 

Let $\phi(x; \theta)$ be a probability function containing an unknown
parameter $\theta$, which is to be estimated from a sample $x_1, x_2, \ldots, x_n$ of n
values of x. The range of x is presumed to be independent of $\theta$; and $\phi$ is
presumed to be uniformly differentiable with respect to $\theta$. Let us denote
the compound probability density of the sample by

$$\Phi(x_1, x_2, \ldots, x_n; \theta) \equiv \Phi.$$

It is required to estimate $\theta$ by a function $t(x_1, x_2, \ldots, x_n) \equiv t$, where t
is independent of $\theta$. Let us adopt the two following postulates:

(i) That t shall be an unbiased or consistent estimate of $\theta$, in the sense
that the mean value or expectation of t over all n-ary samples $x_1, x_2, \ldots, x_n$
shall be $\theta$; that is,

$$\int\!\!\int\cdots\int t\,\Phi\,dx_1\,dx_2 \ldots dx_n = \theta. \tag{1}$$

(ii) That the sampling variance of t over all such samples shall be a
minimum; that is,

$$\int\!\!\int\cdots\int (t - \theta)^2\,\Phi\,dx_1\,dx_2 \ldots dx_n = \text{minimum}. \tag{2}$$

We have also the necessary condition of total probability, namely,

(iii)

$$\int\!\!\int\cdots\int \Phi\,dx_1\,dx_2 \ldots dx_n = 1. \tag{3}$$

The problem is thus a minimal problem in the Calculus of Variations,
of positive definite type and formally simple. Let us write the Euler
equation for its solution in the form

$$(t - \theta)\,\Phi - \lambda(\theta)\,\Phi_\theta = 0, \quad\text{where}\quad \Phi_\theta \equiv \frac{\partial\Phi}{\partial\theta}, \tag{4}$$

the Lagrange function $\lambda(\theta)$ being independent of $x_1, x_2, \ldots, x_n$. We
then have

$$t = \theta + \lambda\,\frac{\partial}{\partial\theta}\log\Phi \tag{5}$$

where t must be independent of $\theta$. In short, an estimating function t
exists, provided that the derivative of $\log\Phi$ with respect to $\theta$ can be
resolved as follows:

$$\frac{\partial}{\partial\theta}\log\Phi = (t - \theta)/\lambda(\theta). \tag{6}$$
Two standard examples will serve to illustrate.

Example 1. To estimate $\mu$ in $\phi(x; \mu) = (2\pi)^{-1/2}\exp\{-\tfrac{1}{2}(x - \mu)^2\}$.
Here

$$\frac{\partial}{\partial\mu}\log\Phi = \sum x_j - n\mu = (t - \mu)/\lambda(\mu)$$

provided that $\lambda(\mu) = 1/n$, in which case $t = \sum x_j/n = \bar{x}$. Thus the estimate
of $\mu$ is $\bar{x}$, the mean of the observations. As for $\lambda$, let us anticipate the
general result by noting that its value here, $1/n$, is the minimum variance
in question, namely, the sampling variance of $\bar{x}$.

Example 2. To estimate $\sigma^2$ in $\phi(x; \sigma^2) = (2\pi\sigma^2)^{-1/2}\exp(-\tfrac{1}{2}x^2/\sigma^2)$.
Here

$$\frac{\partial}{\partial\sigma^2}\log\Phi = -\frac{n}{2\sigma^2} + \frac{\sum x_j^2}{2\sigma^4} = (t - \sigma^2)/\lambda(\sigma^2)$$

provided that $\lambda = 2\sigma^4/n$, in which case $t = \sum x_j^2/n$. Again, $2\sigma^4/n$ is the
variance of the estimate t of $\sigma^2$.
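The resolutions in Examples 1 and 2 can be confirmed numerically for any particular sample. In the sketch below the sample and the trial parameter values are hypothetical; the two score functions are the derivatives of log Φ written out from the two densities.

```python
xs = [0.4, -1.1, 2.3, 0.9, -0.2]       # hypothetical sample
n = len(xs)

def score_mean(mu):
    """Derivative of log Phi with respect to mu (unit-variance normal)."""
    return sum(x - mu for x in xs)

def score_variance(s2):
    """Derivative of log Phi with respect to sigma^2 (zero-mean normal)."""
    return -n / (2 * s2) + sum(x * x for x in xs) / (2 * s2 ** 2)

t_mean = sum(xs) / n                    # estimate of mu in Example 1
t_var = sum(x * x for x in xs) / n      # estimate of sigma^2 in Example 2

mu, s2 = 0.7, 1.9                       # arbitrary trial parameter values
lhs1, rhs1 = score_mean(mu), (t_mean - mu) / (1.0 / n)
lhs2, rhs2 = score_variance(s2), (t_var - s2) / (2 * s2 ** 2 / n)
```

In each case the left and right members agree whatever trial value of the parameter is chosen, which is what the resolution (t - θ)/λ(θ) requires.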

If we had had to estimate both $\mu$ and $\sigma^2$ in

$$\phi(x; \mu, \sigma^2) \equiv (2\pi\sigma^2)^{-1/2}\exp\{-\tfrac{1}{2}(x - \mu)^2/\sigma^2\}$$

we should have proceeded by first estimating $\mu$. The estimate proves
as before to be the mean $\bar{x} = \sum x_j/n$, with variance $\lambda = \sigma^2/n$. This estimate
being accepted, we know from standard results that the compound
probability density relative to the estimate $\bar{x}$, not to the unknown $\mu$, is

$$\Phi(x_1, x_2, \ldots, x_n; \sigma^2) = c\,(\sigma^2)^{-(n-1)/2}\exp\{-\tfrac{1}{2}\textstyle\sum(x_j - \bar{x})^2/\sigma^2\}.$$

From this, estimating anew for $\sigma^2$ and writing the customary $s^2$ instead
of t, we find

$$s^2 = \sum(x_j - \bar{x})^2/(n - 1) \quad\text{and}\quad \lambda = 2\sigma^4/(n - 1),$$

the latter being the well-known sampling variance of $s^2$. The sampling
distribution of $s^2$ is known to be not normal, but of Gamma Type.

As a further point of interest, let us try to estimate, not $\sigma^2$ but $\sigma$ in

$$\phi(x; \sigma) = (2\pi\sigma^2)^{-1/2}\exp(-\tfrac{1}{2}x^2/\sigma^2).$$

We obtain

$$\frac{\partial}{\partial\sigma}\log\Phi = -n/\sigma + \sum x_j^2/\sigma^3.$$

Evidently this cannot be put in the form $(t - \sigma)/\lambda(\sigma)$, where t is independent
of $\sigma$, and $\lambda$ is independent of the $x_j$. In this example, therefore, we may
not estimate for $\sigma$, but must estimate for $\sigma^2$. The reason is that the
property of minimum variance is not preserved under non-linear
transformations of the parameter. It follows that in most cases we must
estimate, not $\theta$ but some function $\tau(\theta)$. What function $\tau$ is will appear
in the sequel.

Example 3. The Cauchy distribution

$$\phi(x; \mu) = \frac{1}{\pi\{1 + (x - \mu)^2\}}.$$

Let us try to estimate $\mu$, the abscissa of symmetry. We have

$$\frac{\partial}{\partial\mu}\log\Phi = 2\sum (x_j - \mu)/\{1 + (x_j - \mu)^2\}.$$

This cannot be put into the form $(t - \theta)/\lambda(\theta)$. The Cauchy distribution is
thus intractable by the present method.

Example 4.

$$\phi(x; \theta) = \{\Gamma(k)\}^{-1}\theta^{-k}x^{k-1}e^{-x/\theta},$$

$$\frac{\partial}{\partial\theta}\log\Phi = -\frac{nk}{\theta} + \frac{\sum x_j}{\theta^2} = (t - \theta)/\lambda(\theta)$$

provided that $t = \sum x_j/nk$, and $\lambda = \theta^2/nk$.

Here let us observe that the sample is finite and that the distribution
of t is non-normal.
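The claim of Example 4 can be examined by simulation. The sketch below draws repeated samples from the Type III density with the standard-library generator `random.gammavariate`, whose second argument is the scale and so corresponds to θ here. The values of k, θ, n and the number of repetitions are arbitrary choices made for illustration.

```python
import random

random.seed(0)
k, theta, n = 2.0, 3.0, 5       # hypothetical index, parameter, and sample size
reps = 20000

estimates = []
for _ in range(reps):
    sample = [random.gammavariate(k, theta) for _ in range(n)]
    estimates.append(sum(sample) / (n * k))          # t = sum(x) / (n k)

mean_t = sum(estimates) / reps
var_t = sum((t - mean_t) ** 2 for t in estimates) / (reps - 1)
lam = theta ** 2 / (n * k)                           # lambda = theta^2 / (n k)
```

Within sampling error, `mean_t` should lie close to θ and `var_t` close to λ, although the sample has only five members and t is far from normally distributed.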


4. Comparison with Maximum Likelihood. 

It appears, then, that an estimating function for $\tau(\theta)$ exists when
$\partial\log\Phi/\partial\theta$ resolves itself into the form

$$\frac{\partial}{\partial\theta}\log\Phi = \frac{\partial\tau(\theta)}{\partial\theta}\,\{t(x_1, x_2, \ldots, x_n) - \tau(\theta)\}/\lambda(\theta). \tag{1}$$

We estimate $\tau(\theta)$ by t, and deduce the estimate of $\theta$. Thus our procedure,
when possible, consists in virtually equating $t(x_1, x_2, \ldots, x_n)$ to $\tau(\theta)$;
indeed in formally solving the equation in $\theta$,

$$\frac{\partial}{\partial\theta}\log\Phi = 0. \tag{2}$$

Now this is precisely what is done in R. A. Fisher’s method (Fisher,
1921) of maximum likelihood, for $\log\Phi$ is the logarithm of the compound
probability density of the sample; but in that case it is done without
consideration of the existence or nature of $\lambda(\theta)$, a question which does not
arise. From our standpoint, on the other hand, the existence of $\lambda(\theta)$ is
fundamental; for when $\lambda(\theta)$ exists, the estimate by maximum likelihood
(of $\tau(\theta)$, that is, or the trivial linear function $a\tau(\theta) + b$, but not of
non-linear functions of $\tau(\theta)$) has also minimum variance, even in the case of
a finite sample.
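That solving equation (2) reproduces the estimate t can be seen in the Type III case of Example 4. The sketch below locates the root of the derivative of log Φ by bisection and compares it with the sum of the observations divided by nk. The observations and the value of k are hypothetical.

```python
xs = [1.2, 4.7, 2.9, 6.1]       # hypothetical observations
k = 2.0                          # index of the Type III curve, taken as known
n = len(xs)

def score(theta):
    """Derivative of log Phi with respect to theta for Example 4."""
    return -n * k / theta + sum(xs) / theta ** 2

lo, hi = 0.1, 100.0              # score(lo) > 0 > score(hi)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if score(mid) > 0:
        lo = mid
    else:
        hi = mid

theta_hat = 0.5 * (lo + hi)      # root of the likelihood equation
t = sum(xs) / (n * k)            # estimate given by the postulates
```

The two values coincide, as they must whenever the derivative resolves into the form (1).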


5. Variance of Estimate. 

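The minimal variance $v$ in the estimation of $\tau(\theta)$, that is the sampling
variance of the estimate, can be expressed in several different ways. To avoid
prolixity, let us suppose that the suitable parameter has been ascertained, and
is henceforth called $\theta$. We have then

$$v = \int\!\!\int\cdots\int (t - \theta)^2\,\Phi\,dx_1\,dx_2 \ldots dx_n, \tag{1}$$

where

$$t - \theta = \lambda\,\frac{\partial}{\partial\theta}\log\Phi.$$

Hence

$$v = \lambda^2\int\!\!\int\cdots\int \Bigl(\frac{\partial}{\partial\theta}\log\Phi\Bigr)^2\Phi\,dx_1\,dx_2 \ldots dx_n.$$

Now

$$\frac{\partial^2}{\partial\theta^2}\log\Phi = \frac{\Phi_{\theta\theta}}{\Phi} - \Bigl(\frac{\Phi_\theta}{\Phi}\Bigr)^2,$$

whence

$$\Bigl(\frac{\partial}{\partial\theta}\log\Phi\Bigr)^2\Phi = \Phi_{\theta\theta} - \Phi\,\frac{\partial^2}{\partial\theta^2}\log\Phi. \tag{2}$$

Hence, since the integrals of $\Phi_\theta$ and $\Phi_{\theta\theta}$, the range being independent of
$\theta$, are both zero, we have

$$v = -\lambda^2\int\!\!\int\cdots\int \Bigl(\frac{\partial^2}{\partial\theta^2}\log\Phi\Bigr)\Phi\,dx_1\,dx_2 \ldots dx_n. \tag{3}$$

But again, since

$$\frac{\partial}{\partial\theta}\log\Phi = (t - \theta)/\lambda(\theta), \qquad \frac{\partial^2}{\partial\theta^2}\log\Phi = (t - \theta)\frac{\partial}{\partial\theta}\{\lambda(\theta)\}^{-1} - \{\lambda(\theta)\}^{-1},$$

we have

$$v = -\lambda^2\int\!\!\int\cdots\int (-\lambda^{-1}\Phi)\,dx_1\,dx_2 \ldots dx_n, \tag{4}$$

since

$$\int\!\!\int\cdots\int (t - \theta)\,\Phi\,dx_1\,dx_2 \ldots dx_n = 0.$$

Hence $v = \lambda$; while equally we see from (3) that $v^{-1}$ is the mean value of
$-\partial^2 L/\partial\theta^2$, where $L = \log\Phi$. This last is a well-known result of R. A.
Fisher, which here we see holding accurately in finite samples, provided
that $\lambda(\theta)$ exists, and that $\partial L/\partial\theta = (t - \theta)/\lambda(\theta)$.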
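The equality of $v^{-1}$ with the mean value of $-\partial^2 L/\partial\theta^2$ can be checked directly for Example 4. In that case the second derivative is linear in the sum of the observations, so its mean value is obtained by substituting the mean of that sum, nkθ. The sketch below does this with hypothetical values of k, θ and n.

```python
k, theta, n = 2.0, 3.0, 5            # hypothetical values
lam = theta ** 2 / (n * k)           # lambda = theta^2 / (n k) from Example 4

def neg_second_derivative(sum_x):
    """-(d^2/d theta^2) log Phi = -n k / theta^2 + 2 sum(x) / theta^3."""
    return -n * k / theta ** 2 + 2.0 * sum_x / theta ** 3

expected_sum = n * k * theta         # mean value of sum(x) for the Type III density
information = neg_second_derivative(expected_sum)
```

The value of `information` is the reciprocal of λ, in agreement with v = λ.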

It is easy to prove from $v = \lambda$ that the only Pearsonian distribution for
which the mean is a sufficient statistic for locating the curve is the normal
distribution. For since the variance of the mean of any distribution is
$\sigma^2/n$, we must have

$$\frac{\partial}{\partial\theta}\log\Phi = -n(\theta - \textstyle\sum x_j/n)/\sigma^2,$$

whence

$$\frac{\partial}{\partial\theta}\log\phi = -(\theta - x)/\sigma^2$$

and so

$$\phi = c(x, \sigma)\exp\{-\tfrac{1}{2}(x - \theta)^2/\sigma^2\}.$$

There is an arbitrary step here, namely, the choice of the arbitrary
function c, which may involve x and $\sigma$ but not $\theta$, but a census of the
Pearsonian curves will show that no other Pearsonian type except the
normal will satisfy the problem. A probability function of Type A would,
however, be admissible.


6. Probability Functions Amenable to the Postulates.

By integrating the fundamental equation of estimation

$$\frac{\partial}{\partial\theta}\log\Phi = (t - \theta)/\lambda(\theta) \tag{1}$$

we may ascertain what class of probability function is amenable to the
present method. Since t is independent of $\theta$, and $\lambda$ is independent of the
$x_j$, we have

$$\log\Phi = \mu(\theta)(t - \theta) + \nu(\theta) + C, \tag{2}$$

where

$$\mu(\theta) = \int\frac{d\theta}{\lambda(\theta)}, \qquad \nu(\theta) = \int\mu(\theta)\,d\theta, \tag{3}$$

and where C is a function of the $x_j$ but not of $\theta$. Thus $\phi(x; \theta)$ must be
of the form

$$\phi(x; \theta) = \exp\{\mu(\theta)\,t(x) + F(\theta) + f(x)\} \tag{4}$$

and must also satisfy the condition of total probability, § 3 (3). This
class of functions includes the normal function and functions of Pearson’s
Type III.

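By means of the above result we are able to determine what function
$\tau(\theta)$ should be estimated, if necessary, instead of $\theta$. For if

$$\phi = \exp(\mu t + F + f)$$

then

$$\Phi = \exp(\mu\textstyle\sum t + nF + \sum f)$$

and so

$$\frac{\partial}{\partial\theta}\log\Phi = \frac{\partial\mu}{\partial\theta}\sum t + n\frac{\partial F}{\partial\theta}.$$

Hence if

$$\tau = -\frac{dF}{d\mu} = -\frac{\partial F}{\partial\theta}\Big/\frac{\partial\mu}{\partial\theta}, \tag{5}$$

we arrive at

$$\frac{\partial}{\partial\tau}\log\Phi = (\textstyle\sum t/n - \tau)/\lambda(\tau). \tag{6}$$

Thus we estimate for $\tau(\theta)$, as given by $-dF/d\mu$, and our estimate is
indeed $\sum t/n$, its variance being $(n\,\partial\mu/\partial\tau)^{-1}$.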

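Example.

$$\phi(x; \sigma) = (2\pi\sigma^2)^{-1/2}\exp(-\tfrac{1}{2}x^2/\sigma^2).$$

Here

$$\mu = -1/2\sigma^2, \qquad F = -\log\sigma,$$

whence

$$-\frac{\partial F}{\partial\mu} = -\frac{\partial F}{\partial\sigma}\Big/\frac{\partial\mu}{\partial\sigma} = \sigma^2,$$

which we must therefore estimate. Also $v = n^{-1}\,\partial\tau/\partial\mu = 2\sigma^4/n$.

Or again, purely in terms of $\mu$ and F, we might have written

$$v^{-1} = n\frac{\partial\mu}{\partial\tau} = n\frac{\partial\mu}{\partial\theta}\Big/\frac{\partial\tau}{\partial\theta},$$

whence

$$v^{-1} = n\frac{\partial\mu}{\partial\theta}\Big/\Bigl(-\frac{\partial^2 F}{\partial\mu^2}\,\frac{\partial\mu}{\partial\theta}\Bigr) \quad \Bigl(\text{since } \tau = -\frac{\partial F}{\partial\mu}\Bigr) = -n\Big/\frac{\partial^2 F}{\partial\mu^2}.$$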

7. Sufficient Statistics and Minimum Variance. 

In a paper, “On distributions admitting a sufficient statistic,” B. O.
Koopman finds the general form of a distribution function (Koopman,
1936) admitting the determination of sufficient statistics for the estimation
of some or all of the parameters $\theta_j$.

The intuitive notion of sufficiency is this, that a statistic should use up
the whole of the relevant information contained in a sample. The
formulation of this idea with respect to $\phi(x; \theta)$ leads to a sufficient statistic
estimating $\theta$, provided that the equality

$$\frac{\Phi(x_1', x_2', \ldots, x_n'; \theta)}{\Phi(x_1', x_2', \ldots, x_n'; \theta')} = \frac{\Phi(x_1, x_2, \ldots, x_n; \theta)}{\Phi(x_1, x_2, \ldots, x_n; \theta')}$$

is implied by $t(x_1', x_2', \ldots, x_n') = t(x_1, x_2, \ldots, x_n)$ for any two values
$\theta$ and $\theta'$ of the parameter and any two possible samples, as indicated.

Koopman finds that, if $\phi$ is analytic and non-zero over some
continuous range of $\theta$, then

$$\phi(x; \theta) = \exp\{F_1(\theta)f_1(x) + F_2(\theta) + f_2(x)\},$$

where the functions in the exponent are real, single-valued, analytic
functions of their arguments.

Now this is the form which we obtained in § 6 for probability functions
amenable to the postulates of unbiased estimate and minimum variance.

We see, therefore, that these functions also admit sufficient statistics for
the estimation of the parameters concerned.
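The defining equality of sufficiency is easily exhibited for the normal curve of unit variance, where the mean is the statistic t. In the sketch below the two samples are hypothetical and have the same mean; the ratio of compound densities at two values of the parameter is then the same for both.

```python
import math

def compound_density(xs, theta):
    """Compound density of a unit-variance normal sample with mean theta."""
    return math.prod(
        math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi) for x in xs
    )

sample_a = [0.0, 1.0, 2.0]       # hypothetical sample
sample_b = [1.0, 1.0, 1.0]       # a different sample with the same mean
theta, theta_dash = 0.3, 1.2     # any two values of the parameter

ratio_a = compound_density(sample_a, theta) / compound_density(sample_a, theta_dash)
ratio_b = compound_density(sample_b, theta) / compound_density(sample_b, theta_dash)
```

A third sample with a different mean would in general give a different ratio, so that the ratio depends on the sample only through t.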

8. Simultaneous Estimation of Parameters.

Problems of so-called simultaneous estimation lead to the following
equations, in k parameters:

$$t_j(x_1, x_2, \ldots, x_n) = \theta_j + \lambda_j(\theta_1, \theta_2, \ldots, \theta_k)\,\frac{\partial}{\partial\theta_j}\log\Phi, \qquad j = 1, 2, \ldots, k,$$

where $t_1, t_2, \ldots, t_k$ are functions of $x_1, x_2, \ldots, x_n$ independent of
$\theta_1, \theta_2, \ldots, \theta_k$ and $\lambda_1, \lambda_2, \ldots, \lambda_k$ are functions of $\theta_1, \theta_2, \ldots, \theta_k$
independent of $x_1, x_2, \ldots, x_n$. The strictly simultaneous solution of
these equations is usually difficult. It is, indeed, extremely rare to find
all the parameters admitting equations from which the factor $\lambda(\theta)$ is
extricable. As we have seen, even the estimation of $\mu$ and $\sigma^2$ from a
normal sample involves us in difficulty, unless we agree to estimate, first
$\mu$ by $\bar{x}$, and then $\sigma^2$ from the corresponding modified probability density,
relative to $\bar{x}$, not to $\mu$. The estimation of the means $\mu_{10}$ and $\mu_{01}$ of x and
y, the variances $\mu_{20}$ and $\mu_{02}$ and the correlation coefficient $\rho$ from a bivariate
normal sample of n paired values $(x_j, y_j)$ offers still greater difficulties.

It is proposed to reserve the subject of simultaneous estimation and 
other aspects of the method of minimum variance for future consideration. 

Summary. 

In the problem of estimating from sample the value of a parameter 
in a probability function new postulates are suggested of unbiased 
linear estimate and minimum sampling variance. A comparison is 
made, with illustrative examples, between this method and the principle 
of maximum likelihood, and ground common to the two is traversed. 
The new postulates are also placed in relation to the theory of sufficient 
statistics. 

REFERENCES TO LITERATURE. 

Fisher, R. A., 1921. “On the mathematical foundations of theoretical
statistics,” Phil. Trans., A, vol. ccxxii, pp. 309-368.

Koopman, B. O., 1936. “On distributions admitting a sufficient statistic,”
Trans. Amer. Math. Soc., vol. xxxix, pp. 399-409.


(Issued separately April 2, 1942.)






VIII.—The Central Limit Theorem for a Convergent Non- 
homogeneous Finite Markov Chain.* By J. L. Mott, 

Department of Mathematics, University of Edinburgh. Communicated
by Professor A. C. AITKEN, F.R.S.

(MS. received January 3, 1958. Revised MS. received April 24, 1958.

Read May 5, 1958)


Synopsis 

The distribution of $x_n$, the number of occurrences of a given one of k possible states
of a non-homogeneous Markov chain $\{P_j\}$ in n successive trials, is considered. It is
shown that if $P_n \to P$, a positive-regular stochastic matrix, as $n \to \infty$ then the distribution
about its mean of $x_n/n^{1/2}$ tends to normality, and that the variance tends to that of the
corresponding distribution associated with the homogeneous chain $\{P\}$.


1. Introduction

We say that a non-homogeneous finite Markov chain is convergent if the
sequence $\{P_j\}$ of matrices of one-step transition probabilities of the chain
tends to a limit matrix $P$ as $j \to \infty$, and if $P$ is positive-regular so that
$\lim_{n\to\infty} P^n = U$ exists and the elements of $U$ are all positive. We know that
if this is so the product $P_1P_2 \cdots P_n = P^{(n)}$ also tends to $U$ as $n \to \infty$
(Mott 1957, p. 379).

We can interpret this result in a second way. Let $x_n$ be the number
of occurrences in n successive trials of a given one of the k states; we can
take this state to be the first without loss of generality. Then $x_n$ has a
certain probability distribution whose mean $\bar{x}_n$ is given by

$$\bar{x}_n = \mathbf{a}\{P^{(1)} + P^{(2)} + \ldots + P^{(n)}\}[1, \cdot, \ldots, \cdot]'$$

where the row vector $\mathbf{a}$ denotes the initial probability distribution and
$[1, \cdot, \ldots, \cdot]'$ denotes the transpose of the row vector $[1, \cdot, \ldots, \cdot]$. Thus
if $P^{(n)} \to U$, as is the case above, then $\bar{x}_n \to n\mathbf{a}U[1, \cdot, \ldots, \cdot]'$, so that the
mean frequency of occurrences of the given state in the non-homogeneous
chain of trials tends to asymptotic equality with the mean frequency of
occurrence in the corresponding homogeneous chain.
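The behaviour of the mean can be illustrated on a two-state chain. In the sketch below the matrices $P_j$ are hypothetical: they differ from a fixed limit matrix P by a term of order 1/(j + 1), and the limit has stationary probabilities 1/3 and 2/3. The mean is accumulated as the sum of the first elements of $\mathbf{a}P^{(j)}$, following the formula above.

```python
def transition_matrix(j):
    """Hypothetical P_j, tending to [[0.5, 0.5], [0.25, 0.75]] as j grows."""
    eps = 1.0 / (j + 1)
    return [[0.5 + 0.2 * eps, 0.5 - 0.2 * eps],
            [0.25 + 0.1 * eps, 0.75 - 0.1 * eps]]

def mean_occupation(initial, n):
    """Mean number of occurrences of the first state: a{P(1) + ... + P(n)}[1, 0]'."""
    dist = list(initial)
    total = 0.0
    for j in range(1, n + 1):
        P = transition_matrix(j)
        dist = [dist[0] * P[0][b] + dist[1] * P[1][b] for b in range(2)]  # a P^(j)
        total += dist[0]
    return total

n = 5000
mean_frequency = mean_occupation([1.0, 0.0], n) / n
limit_frequency = 1.0 / 3.0      # first element of each row of U for the limit matrix
```

For large n the mean frequency in the non-homogeneous chain settles close to the stationary probability of the limiting homogeneous chain.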

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 

P.R.S.E.—VOL. LXV, A, 1958-59, PART 




We can consider likewise the higher moments of the distribution of $x_n$
or of $x_n/n^{1/2}$. We now show that all moments of the distribution of $x_n/n^{1/2}$ in
a sequence of n successive trials of the convergent non-homogeneous chain
tend to asymptotic equality to the corresponding moments for the associated
homogeneous chain. Since the distribution of $x_n/n^{1/2}$ in the latter case
tends to normality as $n \to \infty$ (Fréchet 1938, p. 160) it follows that the
distribution of $x_n/n^{1/2}$ also tends to normality in the case of a convergent
non-homogeneous chain, and that the variances of two such associated
distributions are asymptotically equal.


2. The Moments of the Distribution of $x_n$

The nature of the proof of these results in the general case of a chain
with k states is exemplified by the case of $k = 3$, and for clarity we shall
set out the details of the proof for this particular case. We write $\mu_r^{(n)}$ for
the rth moment, about the current mean, of the distribution of $x_n$. We
show that the difference between the values of $\mu_2^{(n)}/n$ for the
non-homogeneous convergent and associated homogeneous chains tends to
zero as $n \to \infty$, and then that

$$\mu_{2m}^{(n)} = (2m - 1)(2m - 3) \cdots 3 \cdot 1\,\{\mu_2^{(n)}\}^m + O(n^{m-1}) \tag{1}$$

and

$$\mu_{2m+1}^{(n)} = O(n^m). \tag{2}$$

The result that the distribution of $x_n/n^{1/2}$ is asymptotically normal then
follows.

We consider the increments to the moments on passing from one stage
to the next and hence obtain the moments of the distribution after n trials
by the summation of n such increments. Let the initial probabilities of
the occurrence of the three states be given by the elements of the row vector
$[\alpha, \beta, \gamma]$, and denote a typical matrix $P_j$ of the chain by $P_j = [{}_jp_{ab}]$ for
$a, b = 1, 2, 3$. Then the characteristic function (c.f.) of the distribution of
$x_n$ is

$$e^{-it\bar{x}_n}\,[\alpha e^{it}, \beta, \gamma]\,\prod_{j=1}^{n-1}
\begin{bmatrix}
{}_jp_{11}e^{it} & {}_jp_{12} & {}_jp_{13} \\
{}_jp_{21}e^{it} & {}_jp_{22} & {}_jp_{23} \\
{}_jp_{31}e^{it} & {}_jp_{32} & {}_jp_{33}
\end{bmatrix}
[1, 1, 1]'$$

if, as always, we refer the distribution to its current mean $\bar{x}_n$ as origin.
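The matrix form of the c.f. can be compared with a direct enumeration of all sequences of states for a short chain. The sketch below works with the c.f. about the origin, so the factor referring it to the current mean is omitted, and it assumes that the initial state counts as the first trial, which is how the factor $e^{it}$ attached to α is read here. The initial probabilities and the matrices are hypothetical.

```python
import cmath
import itertools

initial = [0.2, 0.5, 0.3]                    # [alpha, beta, gamma], hypothetical
P1 = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.4, 0.4, 0.2]]
P2 = [[0.6, 0.2, 0.2], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4]]
chain = [P1, P2, P1]                         # three transitions, hence four trials

def cf_matrix(t):
    """c.f. of the count of state 1, by the product of modified matrices."""
    e = cmath.exp(1j * t)
    row = [initial[0] * e, initial[1], initial[2]]
    for Pj in chain:
        # The first column of each matrix carries the factor e^{it}.
        row = [sum(row[a] * Pj[a][b] for a in range(3)) * (e if b == 0 else 1.0)
               for b in range(3)]
    return sum(row)                          # post-multiplication by [1, 1, 1]'

def cf_enumerated(t):
    """The same c.f., by summing over every possible sequence of states."""
    total = 0.0
    for path in itertools.product(range(3), repeat=len(chain) + 1):
        prob = initial[path[0]]
        for j, Pj in enumerate(chain):
            prob *= Pj[path[j]][path[j + 1]]
        total += prob * cmath.exp(1j * t * path.count(0))
    return total

difference = abs(cf_matrix(0.7) - cf_enumerated(0.7))
```

The two computations agree to rounding error, and the c.f. takes the value unity at t = 0 as it should.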

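To bring into prominence the moments of the distribution and to
utilize the particular nature of stochastic matrices, namely, that their row
sums are unity, it is convenient to make a certain matrix transformation: *
we replace each matrix ( ) in the c.f. by $H(\ )H^{-1}$ where

$$H = \begin{bmatrix} 1 & \cdot & \cdot \\ -1 & 1 & \cdot \\ -1 & \cdot & 1 \end{bmatrix}
\quad\text{and so}\quad
H^{-1} = \begin{bmatrix} 1 & \cdot & \cdot \\ 1 & 1 & \cdot \\ 1 & \cdot & 1 \end{bmatrix}.$$

Under this transformation a typical matrix occurring in the c.f. becomes

$$F_j(t) \equiv \begin{bmatrix}
{}_jp_{11}e^{it} + {}_jp_{12} + {}_jp_{13} & {}_jp_{12} & {}_jp_{13} \\
({}_jp_{21} - {}_jp_{11})e^{it} + ({}_jp_{22} - {}_jp_{12}) + ({}_jp_{23} - {}_jp_{13}) & {}_jp_{22} - {}_jp_{12} & {}_jp_{23} - {}_jp_{13} \\
({}_jp_{31} - {}_jp_{11})e^{it} + ({}_jp_{32} - {}_jp_{12}) + ({}_jp_{33} - {}_jp_{13}) & {}_jp_{32} - {}_jp_{12} & {}_jp_{33} - {}_jp_{13}
\end{bmatrix}$$

and the c.f. of the distribution of $x_n$ is

$$e^{-it\bar{x}_n}\,[\alpha e^{it}, \beta, \gamma]\,H^{-1}\Bigl\{\prod_{j=1}^{n-1} F_j(t)\Bigr\}H[1, 1, 1]'.$$

Thus

$$\phi_n(t) = e^{-it\bar{x}_n}\,[\alpha e^{it} + \beta + \gamma,\ \beta,\ \gamma]\Bigl\{\prod_{j=1}^{n-1} F_j(t)\Bigr\}[1, \cdot, \cdot]' = \mathbf{a}_n[1, \cdot, \cdot]'$$

say, where $\mathbf{a}_n = [\alpha_n, \beta_n, \gamma_n]$. We can expand the exponential terms as
series in ascending powers of $(it)$ and so write $\phi_n(t)$ in the form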

f 1 + • •., v'r 1 + 4 »>(*y)+v'">— 2 +..., 

L 21 21 

We have used the same notation in $\alpha_n$ as for the moments; this is because
the effect of the post-multiplication of $\mathbf{a}_n$ by the column vector $[1, \cdot, \cdot]'$ is to
give $\phi_n(t) = \alpha_n$, and this fact accounts also for the absence of a term
$\mu_1^{(n)}(it)$, which does not occur since the distribution is referred to its current
mean as origin.

For finding the increments to the moments we are led to the study of 
the corresponding change in on transition from the yth to the (j + i)th 
stage, and so to the study of the change in a jt This necessitates in turn 
the study of c^, for the vectors g, and a M are related by the recurrence 
relation 

S« - e~ Ai (3) 

where $\Delta_j$ is the increment to the mean on passing from the jth to the
(j + 1)th stage.

* I am indebted to Professor A. C. Aitken for this suggestion.


3. The Variance of the Distribution of $x_n$

For conciseness we write

iK^iPab-iPn for « K '- 2)3 im<1 A-1,2, 
and we then have on comparison of the elements of the first column of (3) 
that 

- A 0 ) + 4W/2 1 ~ • • * H' 1 + l'm/u + *+A» 1 IM 

+ + i+iPu + 2 J'kAjVi^ + -1 ,1 A, ef/ 1 -l-;, A, to}i J, J(/V) a /i I T • • • 

11=1 + F#+x/’ll + I+Ai+ #+Al w &^ ~A]\Qt) t {| n\P t ;n/’ll + 2/f.Ai ,,( /* 

+ 2 ftAx w l^+HAi l 'o^ +y+Ai w {/*J " 2 ^j\l i-i At "tin'll >'}/* *'/i Ai w */*J 
+ J|}(2V) 8 / 2 1 + . . . 

The coefficient of (2V) here is necessarily zero so that 

A “ j+xAi + y+Ai*4 jJ + i+Ai^lm (4) 

Using this result we have that A$\ the increment to p!/ 1 011 transition from 
the/th to the (/ + i)th stage, is given by 

Afj.tf ' 1 + 2 M $ n v[® + ay+Ax^i^ + Aj ~A’j. (3) 

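We now need to find expressions for $\nu_1^{(j)}$ and $\omega_1^{(j)}$, and therefore we compare the second, and the third, columns of the vector recurrence relation (3). From the second column we find

$$\nu_0^{(j+1)} + \nu_1^{(j+1)}(it) + \dots = \{1 - \Delta_j(it) + \dots\}\bigl\{\bigl({}_{j+1}p_{12} + {}_{j+1}\delta_{22}\nu_0^{(j)} + {}_{j+1}\delta_{32}\omega_0^{(j)}\bigr) + \bigl({}_{j+1}\delta_{22}\nu_1^{(j)} + {}_{j+1}\delta_{32}\omega_1^{(j)}\bigr)(it) + \dots\bigr\},$$

from which we have

$$\nu_0^{(j+1)} = {}_{j+1}p_{12} + {}_{j+1}\delta_{22}\nu_0^{(j)} + {}_{j+1}\delta_{32}\omega_0^{(j)} \tag{6}$$

and

$$\nu_1^{(j+1)} = {}_{j+1}\delta_{22}\nu_1^{(j)} + {}_{j+1}\delta_{32}\omega_1^{(j)} - \bigl({}_{j+1}p_{12} + {}_{j+1}\delta_{22}\nu_0^{(j)} + {}_{j+1}\delta_{32}\omega_0^{(j)}\bigr)\Delta_j = {}_{j+1}\delta_{22}\nu_1^{(j)} + {}_{j+1}\delta_{32}\omega_1^{(j)} - \nu_0^{(j+1)}\Delta_j. \tag{7}$$

From the third column we find similarly

$$\omega_0^{(j+1)} = {}_{j+1}p_{13} + {}_{j+1}\delta_{23}\nu_0^{(j)} + {}_{j+1}\delta_{33}\omega_0^{(j)}$$

and

$$\omega_1^{(j+1)} = {}_{j+1}\delta_{23}\nu_1^{(j)} + {}_{j+1}\delta_{33}\omega_1^{(j)} - \omega_0^{(j+1)}\Delta_j.$$

We can express these results concisely by the use of a vector notation. Write

$$p_j = [{}_jp_{12},\ {}_jp_{13}], \qquad v_0^{(j)} = [\nu_0^{(j)},\ \omega_0^{(j)}], \qquad v_1^{(j)} = [\nu_1^{(j)},\ \omega_1^{(j)}]$$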



for a Convergent Non-homogeneous Finite Markov Chain 113 


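and

$$D_j = \begin{bmatrix} {}_j\delta_{22} & {}_j\delta_{23} \\ {}_j\delta_{32} & {}_j\delta_{33} \end{bmatrix}.$$

Then we have

$$v_0^{(j+1)} = p_{j+1} + v_0^{(j)}D_{j+1} \quad\text{and}\quad v_1^{(j+1)} = v_1^{(j)}D_{j+1} - v_0^{(j+1)}\Delta_j. \tag{8}$$

By the use of (5), (4) and (8) we can express $\Delta\mu_2^{(j)}$, and hence $\mu_2^{(n)}$, in terms of the initial probabilities and the $D_j$, and so in terms of the initial probabilities and the elements of the $P_j$ of the chain.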

4. The Convergence of the Variance 

We wish to show that the value of $\mu_2^{(n)}/n$ for the chain in which $P_n \to P$ tends, as $n \to \infty$, to that for the homogeneous chain with $P$ as its matrix of transition probabilities. It is sufficient to show that the corresponding $\Delta\mu_2^{(n)}$ tend to coincidence, and this follows if we show that $v_0^{(n)}$, $v_1^{(n)}$ and $\Delta_n$ tend to coincidence with the corresponding $v_0^{(n)}$, $v_1^{(n)}$ and $\Delta_n$ for the homogeneous chain. We consider first $v_0^{(n)}$.

By the use of (8) we have that

$$v_0^{(n)} = p_n + p_{n-1}D_n + p_{n-2}D_{n-1}D_n + \dots + p_0D_1D_2 \dots D_n, \tag{9}$$

where $p_0 = [\beta, \gamma]$. If we define $p$ and $D$ similarly for the associated homogeneous chain we have likewise in that case

$$v_0^{(n)} = p + pD + pD^2 + \dots + pD^{n-1} + p_0D^n. \tag{10}$$

We wish to compare (9) and (10) and for this need the following lemma, which we state and prove with reference to the general case of a chain with $k$ states. We use the notation $|B| < b$ to mean that each element of the matrix $B$ does not exceed $b$ in modulus, and $P^{[n]}$ to denote a product of $n$ matrices $P_j$.

LEMMA.—If $|P_j - P| < \epsilon$ for all $j$ then $|P^{[n]} - P^n| < nk\epsilon$.

Proof.—The result clearly holds for $n = 1$. Write $P_j = P + E_j$, and assume that the result is true for $n = m$. Then, for some $j$,

$$\begin{aligned}
|P^{[m+1]} - P^{m+1}| &= |P_jP^{[m]} - P^{m+1}| \\
&= |(P + E_j)P^{[m]} - P \cdot P^m| \\
&= |P(P^{[m]} - P^m) + E_jP^{[m]}| \\
&< |P(P^{[m]} - P^m)| + |E_jP^{[m]}| \\
&< mk\epsilon + k\epsilon
\end{aligned}$$

since $P$ has non-negative elements and unit row sums. It follows by induction that the result is true for all values of $n$, since we certainly have from the inequality above that $|P^{[m+1]} - P^{m+1}| < (m + 1)k\epsilon$.
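The lemma can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names (`random_stochastic`, `nearby_stochastic`, `product_gap`) and the particular values of $k$, $n$ and $\epsilon$ are illustrative choices and are not part of the paper.

```python
import numpy as np

def random_stochastic(k, rng):
    """Random k x k matrix with non-negative elements and unit row sums."""
    m = rng.random((k, k))
    return m / m.sum(axis=1, keepdims=True)

def nearby_stochastic(P, eps, rng):
    """Stochastic matrix Q with every element of Q - P below eps in modulus."""
    R = random_stochastic(P.shape[0], rng)
    t = 0.9 * eps  # |R - P| <= 1 elementwise, so |t (R - P)| < eps
    return (1.0 - t) * P + t * R

def product_gap(P, factors):
    """Largest modulus among the elements of P_1 P_2 ... P_n - P^n."""
    prod = np.eye(P.shape[0])
    for Q in factors:
        prod = prod @ Q
    return np.abs(prod - np.linalg.matrix_power(P, len(factors))).max()

rng = np.random.default_rng(0)
k, n, eps = 3, 6, 1e-3  # illustrative values only
P = random_stochastic(k, rng)
factors = [nearby_stochastic(P, eps, rng) for _ in range(n)]
gap, bound = product_gap(P, factors), n * k * eps
print(gap, bound)  # the lemma asserts that gap does not exceed bound
```

The perturbed matrices are convex combinations of $P$ with another stochastic matrix, which keeps them stochastic while holding each element within $\epsilon$ of $P$.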

We now consider the difference between the expressions as given by (9) and (10) for $v_0^{(n)}$ in the non-homogeneous and associated homogeneous cases. We shall once again treat the general case of a chain with $k$ states; in this case expressions formally equivalent to those of (9) and (10) occur, the only difference being that the elements of each $D_j$, which is now of order $k - 1$, arise by subtraction of the first row of the corresponding $P_j$ from the remaining $k - 1$ rows.
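The reduction from $P_j$ to $D_j$ described here can be sketched in a few lines. This is an illustrative check, assuming NumPy; the function names (`reduced`, `col_range`, `transform`) are hypothetical, and the matrix $H$ is taken to be the identity with $-1$ down the first column below the leading element, which is the $k$-state analogue of the $3 \times 3$ transformation used earlier.

```python
import numpy as np

def reduced(P):
    """D: subtract the first row of P from the other rows, then drop the first column."""
    return P[1:, 1:] - P[0, 1:]

def col_range(P):
    """Largest difference between two elements of the same column of P."""
    return (P.max(axis=0) - P.min(axis=0)).max()

def transform(P):
    """H P H^{-1}, with H the identity except for -1 down the first column."""
    H = np.eye(P.shape[0])
    H[1:, 0] = -1.0
    return H @ P @ np.linalg.inv(H)

def random_stochastic(k, rng):
    m = rng.random((k, k))
    return m / m.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
Pa, Pb = random_stochastic(4, rng), random_stochastic(4, rng)
T = transform(Pa)
print(np.round(T, 3))  # first column is (1, 0, ..., 0)'; lower-right block is D
```

Because the transformed matrices are block triangular, the reduced matrix of a product $P_aP_b$ is the product of the reduced matrices, and each element of $D$ is a difference of two elements in one column of $P$, so it cannot exceed the range in modulus.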

We can write the difference of the $v_0^{(n)}$ as

$$\{(p_n - p) + (p_{n-1}D_n - pD) + \dots + (p_{n-j_1+1}D_{n-j_1+2} \dots D_n - pD^{j_1-1})\}$$
$$+ \{(p_{n-j_1}D_{n-j_1+1} \dots D_n - pD^{j_1}) + \dots + (p_0D_1D_2 \dots D_n - p_0D^n)\} = A + B,$$

say, there being $j_1$ pairs of terms in $A$ and $n - j_1 + 1$ pairs of terms in $B$.

Consider $B$ first. We suppose to begin with that $P$ is itself positive so that we have $|P| > 2\epsilon$ for some $\epsilon > 0$. Then since $P_j \to P$ as $j \to \infty$ we have that $|P_j - P| < \epsilon$ for $j$ sufficiently large, $j > j_0$ say, so that $|P_j| > \epsilon$ for $j > j_0$. Let $P^{[r]}$ and $D^{[r]}$ now denote the products of $r$ matrices $P_j$, $D_j$ respectively for all of which $j > j_0$, and let $d = 1 - \epsilon < 1$. We define the range $\rho_j$ of a finite stochastic matrix $P_j$ by

$$\rho_j = \max_{\alpha, \alpha', \beta} |{}_jp_{\alpha\beta} - {}_jp_{\alpha'\beta}|.$$

It follows that $|D_j| < \rho_j$. Then if $\xi_j$ denotes the least element of $P_j$ we have, for $j > j_0$, that $1 - \xi_j < d$ so the range of $P^{[r]}$ does not exceed $d^r$ (Mott 1957, p. 372) and hence $|D^{[r]}| < d^r$. Also, since $1 - 2\epsilon < d$, we have that $|D^r| < d^r$. Further, the range of a product $P_aP_b$ of finite stochastic matrices does not exceed the range of $P_b$, for each element of a given column of $P_aP_b$ is a weighted mean of the elements of the corresponding column of $P_b$ so that the greatest element of a column of $P_aP_b$ cannot exceed, nor the least element be less than, the greatest and least elements respectively of the corresponding column of $P_b$. Thus $|(D_{j_0-a} \dots D_{j_0})(D_{j_0+1} \dots D_m)| < d^{m-j_0}$, where $0 < a < j_0 - 1$ and $m > j_0 + 1$. Now take $n - j_1 + 1 > j_0$ and note the result that $|pD| < d$ and the analogous results for a product of $D$s, and we have that

$$|B| < 2\{d^{j_1} + d^{j_1+1} + \dots + d^{n-j_0} + j_0d^{n-j_0}\} < 2\{j_0d^{j_1} + d^{j_1}/(1 - d)\}.$$

Thus, given any $\varepsilon > 0$, we can choose $j_1$ such that $|B| < \tfrac{1}{2}\varepsilon$.






The value of $n$ is still arbitrary except that $n > j_0 + j_1 - 1$, and we now show that by a suitable choice of $n$ we can ensure that likewise $|A| < \tfrac{1}{2}\varepsilon$.

To do this we use the lemma. Given any $\epsilon' > 0$ we have that $|P_j - P| < \epsilon'$ for $j$ sufficiently large, $j > n_1$ say. If now $P^{[r]}$ and $D^{[r]}$ denote the products of $r$ matrices $P_j$, $D_j$ respectively for each of which $j > n_1$ we have from the lemma that $|P^{[r]} - P^r| < rk\epsilon'$ and hence that $|D^{[r]} - D^r| < 2rk\epsilon'$, so that

$$\begin{aligned}
|p_jD^{[r]} - pD^r| &= |(p_j - p)D^{[r]} + p(D^{[r]} - D^r)| \\
&< |(p_j - p)D^{[r]}| + |p(D^{[r]} - D^r)| \\
&< k\epsilon' + 2rk\epsilon' \\
&< 3rk\epsilon'.
\end{aligned}$$

Thus if we take $n - j_1 + 1 > n_1$ (as we can consistently with the condition on $n$ above) we have that

$$|A| < \epsilon' + 3k\epsilon' j_1(j_1 - 1)/2 < 3k\epsilon' j_1(j_1 + 1)/2.$$

If we now choose $\epsilon' = \varepsilon/\{3kj_1(j_1 + 1)\}$ we have that $|A| < \tfrac{1}{2}\varepsilon$ and so $|A + B| < \varepsilon$. It follows that, on the supposition that $P$ is positive, the $v_0^{(n)}$ in the non-homogeneous and associated homogeneous cases tend to coincidence as $n \to \infty$.

In the general case of $P$ positive-regular $P$ itself may be not positive. However, $P^n$ is positive for sufficiently large values of $n$ and we have $|P^n| > 2\epsilon$ for some $\epsilon > 0$ if $n > n_2$ say. Also $|P_n - P| < \epsilon/kn_2$ for $n$ sufficiently large, $n > n_3$ say, so that if $P^{[n_2]}$ now denotes the product of $n_2$ matrices $P_j$ for each of which $j > n_4 = \max\{n_2, n_3\}$ we have from the lemma that $|P^{[n_2]} - P^{n_2}| < \epsilon$ so that $|P^{[n_2]}| > \epsilon$. It follows for a product $D^{[rn_2]}$ of $rn_2$ matrices $D_j$ for each of which $j > n_4$ that $|D^{[rn_2]}| < d^r$. We can now

proceed to consider B much as in the preceding particular case; we treat 
groups of n 2 terms throughout and use the result that | £>^+*3 | < | | # 

Thus we can take j x =(r + %)n 2 and n such that n -/1 4 1 > and have that 

\JB | < 2 kn 2 {d 4 * d % 4 . * .} 4 2 k{n^ 4 n 2 )d tls , 
where is the integral part of (n Thus 

\B | < 2 k{n^ 4 n 2 )d n * 4 2kn 2 dj{i - d ). 

Thus for $j_1$ and $n$ sufficiently large we have $|B| < \tfrac{1}{2}\varepsilon$ for any given $\varepsilon > 0$. We can now show that by the choice of $n$ sufficiently large we can ensure that $|A| < \tfrac{1}{2}\varepsilon$; this we do by making the changes in the previous proof for $A$ corresponding to those we have just made for $B$. We omit the details.




Thus the expressions (9) and (10) for $v_0^{(n)}$ in the non-homogeneous and associated homogeneous cases tend to coincidence as $n \to \infty$. It follows from (4) that the two corresponding expressions for $\Delta_n$ also tend to coincidence; but this we know already since $P^{(n)} \to P^n$.

Now consider the expression for $v_1^{(j)}$ as given by (8). For the moment let us write $v_j$ for $v_1^{(j)}$ and $u_j$ for $v_0^{(j)}\Delta_{j-1}$. With this notation (8) becomes

$$v_j = v_{j-1}D_j - u_j$$

so that

$$v_n = v_0D_1 \dots D_n - u_1D_2 \dots D_n - u_2D_3 \dots D_n - \dots - u_n.$$

We have a corresponding expression for the $v_n$ of the associated homogeneous chain, and have to show that these two expressions tend to equality as $n \to \infty$. But since $D_1 \dots D_n$ tends to zero as $n \to \infty$ and the $u_n$ for the non-homogeneous chain is already known to tend to the $u_n$ for the associated homogeneous chain we have essentially the same problem as that treated above for $v_0^{(n)}$, and the result follows in the same way.

Finally we have from (5), since $P_n \to P$ as $n \to \infty$, that the two expressions for $\Delta\mu_2^{(n)}$ tend to coincidence. It follows that the variances of the probability distributions of $x_n/n^{1/2}$ in the non-homogeneous and associated homogeneous cases tend to coincidence as $n \to \infty$.

The assumption that $P$ is positive-regular ensures that this variance is non-zero (Fréchet 1938, p. 88); this is essential to the following proof of normality.

5. Asymptotic Relations between the Higher Moments 

[5.1] We now prove (1) and (2) for the convergent non-homogeneous 
chain. For ease of printing we shall often omit a suffix b in reference to 
a Ab and to the elements of a p » or of a Z> & , and likewise omit the superfix b 
in reference to a vf\ The occurrence of these suffixes and 

superfixes has been illustrated throughout § 3, which also provides an 
example, for a particular case of moments of low order, of the methods of 
the present general case. The proof is by induction. (1) is certainly true 
for m »1 and is an immediate consequence of 

(11) 

for r=m> provided that (u) is true for all r < m -1. To prove (n) we 
return to the recurrence relation (3). We show below (in § 5.2) that if (11) 
is true for all r < m -1 then the orders of v* n) and u>^ are not higher than 
that of n for all r < m — 1; for the moment we assume this result and 




so have from (3) that /is, apart from terms <9(^ w ~ 2 ), the coefficient of 
(it) 2m l(2m)\ in 

{1 -A(it)+A*(?:t) 2 l2\ + . . . -2)1 + l>2?n-l 

+ (2 m — §21^2w-2 ^31^2^1-2)]( 2 ^) 2w 1 K 2m — l) i + [/^2m m-l 

+ 8 al v 2m _ 1 + S 31 0) 2m _ 1 ) + ?k( 2* - l)( Al/*2m-2 + 8 21 v 2m-2 + S3l"2m-2)K*X> 2 “/( 2OT ) 

Then, using (4) in the form ^ -p lx =8 21 v 0 +8si w o> we ^ ave that 

A /X 2m = 2«[(l' a „,_i - V 0 ^. 2m _x)S 21 + (w 2m _i - O>0^2»n-l)§3l] 

+ mifim - xJIAwt-S Al + v 2m-2^21 + ~ / x 2m-2^“] 

+ 2tn(2tn - l)[(r 2 , u _2 - V 0 ^ 2 m _ 2 )S 2 l + ( aJ 2»i-2 ~ a, of t 2m-2)§:tl] + )• 

On the introduction of A /x 2 by the use of (5) we find that the latter equation 
is equivalent to 

2*»{[(*W-1 - VotHm-d ~ ( 2m - I )> / l/^m-2]82X + [("2m-X " w 0^2m-l) 

- (2m - x)a) 1 fx 2m _ 2 ]S 81 } + m(im - x){(v 2m _ 2 - v 0 /a 2 „,_ 2 )8 21 + (v 2w _ 2 -cv 0 fi 2m _2)8 31 } 

= 0(«™- 2 ). 

This is so if 

V W_ and ( I2 ) 

(v'lx - vlT>/4tx) -(«' - = ^- a ) 

and 

(a4"L 1 - wfrVa”-1) “O'" 2 = ^(» r ' 2 ) 



for r < w +1. , 

[5.2] We have from (9), or more readily from (10) to which (9) tends as 

n -> 00, that viT 1 -C(i). To show this we have merely to note that, with 

the notation of § 4, | D n | < d n ‘ for n > ni so that 

|*#>| <n i +n i + n i (d+d* + . . .) < « 4 + «*/(* -d)= 0 (i). 


It follows from (8) in a similar way that = 0 (i) also. . 

These particular results provide a basis for an inductive proof of the 

general result that 

.2=^.2 = v£Lx=4*i-i“ W 1 ) 


41 


Jf (») = J») for all r < w - x, that is, if (n) and (2) hold for all 

r < tn - v For from (3) v£ +1) is then the coefficient of (it) iT l(ir)\ in 
{i}[M2r(^) 2 V( 2? ')!, v 2r (itf r !{2r)\, w^(il) ir K2r)l][p 12l B^,B n ]' + 0(n r - 1 ), 

and likewise for o& +1) , V&? and «&?. This is the result that we antici¬ 
pated above in § 5.1. 





[5.3] We now prove (12). We assume that (12) is true for $r < m - 2$ and show that in consequence it is true for $r = m - 1$. We have that

so that, on elimination of p n by the use of (7), 

(»+D_ («+!) («) _ a / v c* y 

Itn -2 »o MiJw - 2 - d 2a ( V2M4-2 *“ v i)^m~V + f w-2 “ w*-a) + 0 (// m ™ 2 ). 

We can replace by ^ since J/«£L S by our assumption 

that (11) is true for r < ni - 1 and so have that 

(» + l) (»+X),,(# + 1) _S / \ B , 

V * m ~ S V « ^m-2-^s(V2 m - a -V 0 ^ w _ a )+8 aa (ft» 2m _ a -< 0oM2 „ 1 _ 2 ) + C>( W m 2). ( X4 ) 

Similarly 

a>2m - 2 ~ w o + V2» + - 1 a=8 23 (v 2w _ 2 - i'„p. 2m _ 2 ) + S 33 (ce 2 „,_ 2 -a%/x 2m ,_ !i ) + <9(«™“ 2 ). (15) 

Now write 


4m-2- 4 J Wm-2 : =AV, & 4 SL* - O) 0 /4»-2 =y, and Vy = [ 4 ,, y 3 ]. 

Then with as before (14) and (15) can be combined in the vector equation 

(16) 

where «, +1 is a vector whose elements are From (16) we have that 

Sn ~@n"^ 8 n~lT) n + g n _ 2 Z) M _j^? n 4 - . . . +fl r ( ,/ 7 j/?„ . . , (17) 

Thus (12) follows if 

T> n + T> n _ x D n 4- . , . 4.Z? X Z> 2 . . . D n -~ 0 { 1), 
and this is so by the arguments used earlier in § 5.2 and in § 4, 

[5.4] We now prove (13). The proof is analogous to that of (12): we assume that (13) is true for $r < m - 2$ and show that it is then true for
r=m- i. We then have from the recurrence relation (3) that 1 ,£+]{ is, 
apart from terms 0 (n m ~ z ), the coefficient of (it) im ~ 1 /(2m - 1)! in 

{1 -d( ? 2 , )}{(/ 12/tX2m _ s , + S 22 r 2m _ 2 + 8 82 w 2bi _ 2 )(*) 2 ““ 2 /( 2m -2)1 

■+ (A*M2m-X + §22 v 2 m~l + - i) 1}. 

Then using (4) to eliminate p l2 &nd also (12) with r — 1 we have that 
4 w-l - 4 V4m-1 = Kz( v 2m-1 - r 0 jj. im _ 1 ) 4- S; ! ,(co 2m „ 1 - CO 0 ^ 2m _ 1 ) 

-(2W - + 

By (7) we can write this equation in the form 
Lt 1 ] - 4 " +1 V&Li - (m - M n+1 ¥&-2 + 0 {n m ~ i ) = S 22 [(v 2m _ 1 - v 0/ a 2m _ 1 ) 

- (m - i)v 1 /x 2m _ 2 ] 4- 8 32 [(to 2m _ 1 - woi^m-x) ~ (aw - ( 18 ) 


vi 




We now bring the left-hand side of this equation to its natural form much 
as in the proof of (12) in § $ * 3 • we use the result of § 5.1 that A _ 3 = 0 (n m ~ 2 ) 
and the result of § 5.5 below that Thus we can replace 

the left-hand side of (18) by 

*& + -i - - (2 m -1 M“ + W-i (19) 

without change of the order of the omitted terms. The corresponding 
equation for 

- w o” + 1 Vt + - 1 i - (2 m - 1 W? + 1 ) A + -l (20) 

follows similarly. We can express the two recurrence relations for (19) 
and (20) by a single vector equation identical in form with (16) above, 
and then complete the proof exactly as for (12) in § 5.3. 

[5.5] We have still to begin the process of induction but, apart from this, 

have given an inductive proof of (11) provided that = 0 (n m ~ 2 ). 

We now show that this is so provided that 

4 «S? + i- 0 (* r - 1 ) (21) 

for all r < m - 2. We have from (3) that i s > apart from terms 

0 (n m ~‘ l ) 1 the coefficient of -1)! in 

{1 - 2)1 

+ [>2m-l + ( 2 m- l)(>am-2Al + V 2m -Al + Wsm-aSji)](^) 2m ~V(2OT - 1 ) 1} 
if (l) and (2) are true for all r < m -2. On use of (4) we have that 
= ( 2W “ 1 )[( v 2m-2 ” + ( w 2m-2 ~ + 0 (n m ~ 2 ). 

We now use (12) for r—m- 1, and the result follows. (2) is an immediate 
consequence of (21). 

[5.6] To begin the process of induction we note that (11) is true for $r = 1$ (this is immediate), that (12) is true for $r = 1$ and that (13) is true for $r = 2$. But clearly (12) is true for $r = 1$ from (17), and likewise for (13) with $r = 2$. This completes the proof of (1) and (2).

6. The Proof for a Chain with k States 

The convergence of the distribution of $x_n/n^{1/2}$ to normality follows at once from (1) and (2) (Fréchet and Shohat 1931). We have given the details of our proof for the particular case of $k = 3$, but the extension of this proof to general values of $k$ presents a problem of notation only. We have given in § 4 the details of the convergence of the variance in the general case, and in § 5 we have corresponding changes: in (16), for example, $x_j$ is now a $(k - 1)$-dimensional vector and the $D_j$ are of order $k - 1$. The inductive argument is unchanged.





Acknowledgment 

I should like to thank the referee for his close reading of an earlier draft of this note and in particular for a suggestion that led to an improvement of my notation. I am also indebted to him for bringing to my notice the papers of Dobrušin that are now listed with the references below.


References to Literature 

Dobrušin, R. L., 1955. "Central Limit Theorem for Non-stationary Markov Chains", C.R. Acad. Sci. U.R.S.S., 102, 5-8.

-, 1956a. "On the Condition of the Central Limit Theorem for Inhomogeneous Markov Chains", C.R. Acad. Sci. U.R.S.S., 108, 1004-1006.

-, 1956b. "Central Limit Theorem for Non-stationary Markov Chains", Teor. Veroyatnostei, 1, 72-89 and 365-425.

Fréchet, M., 1938. Recherches théoriques modernes sur la théorie des probabilités, Second Livre. Paris.

Fréchet, M., and Shohat, J., 1931. "A Proof of the Generalized Second Limit-theorem in the Theory of Probability", Trans. Amer. Math. Soc., 33, 533-543.

Mott, J. L., 1957. "Conditions for the Ergodicity of Non-homogeneous Finite Markov Chains", Proc. Roy. Soc. Edin. A, 64, 369-380.


(Issued separately February 19, 1959) 



Propagation of Thermal Stresses in Thin Metallic Rods 12 1 


IX.—The Propagation of Thermal Stresses in Thin Metallic Rods.*
By I. N. Sneddon, The University of Glasgow.

(MS. received April 10, 1958. Revised MS. received August 7, 1958.

Read October 27, 1958)


Synopsis 

If the temperature in an elastic rod is not uniform and if it varies with time, dynamic 
thermal stresses are set up in the rod. This paper is concerned with the calculation of 
the distribution of temperature and stress in an elastic rod when its ends are subjected to 
mechanical or thermal disturbances. Simple waves in an infinite rod are first discussed 
and then boundary value problems for semi-infinite rods and rods of finite length. The 
paper concludes with an account of an approximate method of solving the equations of 
thermoelasticity. 


1. Introduction

If the distribution of temperature in an elastic body is not uniform, 
dilatational changes take place which alter the distribution of stress in the 
body. On the other hand if dilatational waves are being propagated in 
an elastic body the local changes in volume will produce fluctuations of 
temperature in the body. Thermoelasticity, which is concerned with the 
study of the interplay of these two effects, has recently been the subject of 
several papers (Biot 1956; Lessen 1956 and 1957) although the basic 
equations of the subject have been known for quite some time (Duhamel 
1837; Voigt 1910, Jeffreys 1930). 

The present paper is concerned with what is probably the simplest 
physical system in which thermoelastic phenomena can occur—a thin 
metallic rod; in this case the basic equations assume their simplest forms. 
The solutions of problems relating to even this simple system are of 
physical interest and the methods used to obtain these solutions might be 
expected to have obvious generalisations in more complicated situations. 

After a brief account (in §§ 2, 3, 4) of the basic equations of thermo¬ 
elasticity with one space variable and of the system of units we employ, 
we give an account of the propagation of simple waves in an infinite rod 
whose cylindrical surface is impervious to heat. Exact and approximate 
expressions are derived for the phase velocity and the attenuation coefficients 
of these waves and numerical values for four metals (aluminium, copper, 

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 





iron and lead) calculated over a wide range of the frequency of the wave. We introduce a frequency, $\omega^*$, characteristic of the solid, defined by the equation $\omega^* = cE/k$, and a constant $\epsilon = \alpha^2ET/\rho c$ where $c$, $E$, $k$, $\alpha$ and $\rho$ are respectively the specific heat, the Young's modulus, the conductivity, the coefficient of linear expansion and the density of the metal; $T$ is the absolute temperature of the rod in a state of zero strain. It is shown that if the frequency $\omega$ of the wave is less than $\omega^*$ the phase velocity is $(1 + \tfrac{1}{2}\epsilon)(E/\rho)^{1/2}$ and the attenuation coefficient is $\tfrac{1}{4}\epsilon(2 - 5\epsilon)(E/\rho)^{-1/2}(\omega^2/\omega^*)$; these results are the one-dimensional analogue of the results for the dilatational waves in a three-dimensional solid (Deresiewicz 1957; Chadwick and Sneddon 1958).

The remainder of the paper is concerned with boundary value problems associated with semi-infinite rods and rods of finite length. In § 6 we consider semi¬
infinite rods whose free ends are subjected to simple periodic disturbances 
and in § 7 make use of the theory of Laplace transforms to solve similar 
problems with disturbances of a quite general nature. The propagation 
of thermoelastic waves in finite rods is discussed briefly in § 8. The paper 
concludes with an account of a method of deriving approximate solutions 
of the basic equations. 


2. The Thermoelastic Equations 

Under free thermal expansion a thin rod experiences a thermal strain in the direction of the rod of amount

$$\varepsilon^{(\theta)} = \alpha\theta, \tag{2.1}$$

where $\theta$ denotes the temperature change from $T$, the absolute temperature of the rod in a state of zero stress and strain, and $\alpha$ denotes the coefficient of linear expansion of the solid. It is assumed that $\theta$ is so small that the thermal and elastic properties of the solid remain constant throughout the times in which we are interested. If we measure position along the rod by means of a co-ordinate $x$ and denote the displacement of a point with co-ordinate $x$ by $u$ then the total strain at a typical point $x$ is given by

$$\varepsilon = \frac{\partial u}{\partial x}. \tag{2.2}$$

This total strain is made up of the thermal strain and the elastic strain $\varepsilon^{(\sigma)}$ which is given by the equation

$$\varepsilon^{(\sigma)} = \frac{\sigma}{E}, \tag{2.3}$$

where $\sigma$ denotes the stress at the point $x$ and $E$ denotes the Young's modulus of the material of the rod. Substituting from equations (2.1), (2.2) and (2.3) into the equation

$$\varepsilon = \varepsilon^{(\theta)} + \varepsilon^{(\sigma)} \tag{2.4}$$

we obtain the equation

$$\sigma = E\frac{\partial u}{\partial x} - \gamma\theta, \tag{2.5}$$

where

$$\gamma = \alpha E. \tag{2.6}$$

Equation (2.5) is the one-dimensional form of the Duhamel-Neumann law (Sokolnikoff 1956, p. 359).

The thermodynamic variables describing the state of the elastic rod are the strain $\varepsilon$ and the absolute temperature $T + \theta$. It has been shown by Biot (1956) that the entropy per unit volume of an elastic rod is given by

$$s = c\rho\log\left(1 + \frac{\theta}{T}\right) + \gamma\varepsilon, \tag{2.7}$$

where $c$ is the specific heat per unit mass at constant strain (assumed independent of variations in the temperature in the vicinity of the equilibrium temperature $T$), $\rho$ is the density, and the additive constant involved in the definition of the entropy has been chosen so that the entropy is zero in the reference state. If $\theta$ is small in comparison with $T$ we find that

$$s = \frac{c\rho\theta}{T} + \gamma\varepsilon$$

so that the quantity of heat absorbed per unit volume of the solid in the course of small deformations and small variations in temperature is given by the formula

$$h = Ts = c\rho\theta + \gamma T\frac{\partial u}{\partial x}. \tag{2.8}$$


Now it is known from the theory of the linear conduction of heat in solids that when the surface of the rod is rendered impervious to heat, so that no radiation takes place, the equation for the variation of temperature takes the form

$$\frac{\partial h}{\partial t} = k\frac{\partial^2\theta}{\partial x^2} \tag{2.9}$$

(Carslaw and Jaeger 1947, p. 6) where $k$ is the thermal conductivity of the material of the rod. If we substitute from equation (2.8) into equation (2.9) and introduce the diffusivity

$$\kappa = \frac{k}{\rho c} \tag{2.10}$$

we can write equation (2.9) in the form

$$\frac{\partial\theta}{\partial t} = \kappa\frac{\partial^2\theta}{\partial x^2} - \gamma'\frac{\partial^2u}{\partial x\,\partial t}, \tag{2.11}$$

where $\gamma' = \gamma T/(\rho c)$.

Equation (2.11) resembles the customary equation for the linear flow of heat except for the term $\gamma'(\partial^2u/\partial x\,\partial t)$. Duhamel (1837) included a similar term in his considerations but only because he postulated that the rate of change of dilatation of an elastic body would have a linear effect on the rate of change of temperature. Alternative proofs of equation (2.11) have been given by Voigt (1910) and Jeffreys (1930) and more recently by Lessen (1953, 1956, 1957).

To complete the set of basic equations we have the equation of motion

$$\frac{\partial\sigma}{\partial x} + \rho F = \rho\frac{\partial^2u}{\partial t^2}, \tag{2.12}$$

where $F$ denotes the body force per unit mass in the direction of the rod.

The set of equations (2.5), (2.11) and (2.12) is sufficient, when taken with the appropriate boundary conditions, to determine the variation of temperature, stress and displacement along the rod when the body force $F$ is prescribed. Wiener (1937) has proved that solutions of these equations are unique when $\theta$ and $u$ are specified on the ends of the rod and the initial distributions of $\theta$, $u$ and $\partial u/\partial t$ are prescribed; his proof can readily be extended to cover more general boundary value problems.


3. Dimensionless Form of the Equations 

It is convenient to write the basic thermoelastic equations in dimensionless form. If we take a typical length $l$ as our unit of length, a time $\tau$ as our unit of time, the reference temperature $T$ as our unit of temperature, and the Young's modulus $E$ as unit of stress, we find that equations (2.12), (2.5) and (2.11) respectively assume the forms

$$\frac{\partial\sigma}{\partial x} + X = a\frac{\partial^2u}{\partial t^2}, \tag{3.1}$$

$$\sigma = \frac{\partial u}{\partial x} - b\theta, \tag{3.2}$$

$$\frac{\partial^2\theta}{\partial x^2} = f\frac{\partial\theta}{\partial t} + g\frac{\partial^2u}{\partial x\,\partial t}, \tag{3.3}$$

where

$$a = \frac{l^2}{U^2\tau^2}, \quad b = \alpha T, \quad f = \frac{l^2}{\kappa\tau}, \quad g = \frac{\gamma l^2}{k\tau}, \quad X = \frac{\rho lF}{E}, \tag{3.4}$$

with $U = (E/\rho)^{1/2}$, the velocity of elastic waves in the rod. It should be noted that $b$ is independent of the choice of $l$ and $\tau$ as also is

$$\epsilon = \frac{bg}{f} = \frac{\alpha^2ET}{\rho c}. \tag{3.5}$$

If we are interested in low frequencies it is convenient to take $l$ to be 1 cm. and $\tau$ to be 1 sec. The values of $a$, $b$, $f$, $g$ and $\epsilon$ for four metals are shown in Table I; here we have taken $l = 1$ cm., $\tau = 1$ sec. and $T = 293°$ K $= 20°$ C.

Table I

               Aluminium        Copper           Iron             Lead
a              4.029 x 10^-12   7.539 x 10^-12   5.984 x 10^-12   7.063 x 10^-11
b              7.62 x 10^-3     4.98 x 10^-3     1.026 x 10^-2    9.67 x 10^-3
f              1.168            0.899            5.208            4.152
g              0.860            0.479            3.536            1.470
ε = bg/f       5.61 x 10^-3     2.65 x 10^-3     6.97 x 10^-3     3.42 x 10^-3
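As a consistency check on Table I, the last row can be recomputed from the rows for $b$, $f$ and $g$ through equation (3.5). The sketch below is illustrative only: the dictionary name `TABLE_I` and the function name are hypothetical, and the powers of ten for $b$ are read as $10^{-3}$ (aluminium, copper, lead) and $10^{-2}$ (iron), which is an assumption wherever the printed exponent is unclear.

```python
# Values of b, f, g and the tabulated coupling constant, as read from Table I
# (l = 1 cm, tau = 1 sec, T = 293 K). Exponents of b are assumed as noted above.
TABLE_I = {
    "aluminium": {"b": 7.62e-3, "f": 1.168, "g": 0.860, "eps": 5.61e-3},
    "copper": {"b": 4.98e-3, "f": 0.899, "g": 0.479, "eps": 2.65e-3},
    "iron": {"b": 1.026e-2, "f": 5.208, "g": 3.536, "eps": 6.97e-3},
    "lead": {"b": 9.67e-3, "f": 4.152, "g": 1.470, "eps": 3.42e-3},
}

def coupling_constant(b, f, g):
    """Equation (3.5): epsilon = b g / f."""
    return b * g / f

for metal, row in TABLE_I.items():
    computed = coupling_constant(row["b"], row["f"], row["g"])
    print(f"{metal:10s} computed {computed:.3e}  tabulated {row['eps']:.2e}")
```

With these readings the recomputed values agree with the tabulated $\epsilon$ to the three figures printed.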


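For high frequencies it is more appropriate to take

$$l = \frac{U}{\omega^*}, \quad \tau = \frac{1}{\omega^*}, \tag{3.6}$$

where $U$ denotes the elastic wave velocity and $\omega^*$ denotes the frequency

$$\omega^* = \frac{U^2}{\kappa}. \tag{3.7}$$

For this choice of units we have

$$a = 1, \quad f = 1, \quad g = \frac{\alpha E}{\rho c}. \tag{3.8}$$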


Table II

               Aluminium        Copper           Iron             Lead
U (km./sec.)   5.09             3.52             5.21             1.19
ω* (sec.^-1)   3.03 x 10^11     1.11 x 10^11     1.43 x 10^12     5.88 x 10^11
l (cm.)        1.68 x 10^-6     3.16 x 10^-6     3.64 x 10^-7     2.02 x 10^-7
b              7.62 x 10^-3     4.98 x 10^-3     1.03 x 10^-2     9.67 x 10^-3
g              0.736            0.533            0.679            0.354
ω_c (sec.^-1)  7.90 x 10^13     6.10 x 10^13     8.94 x 10^13     2.05 x 10^13
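The unit of length in Table II should equal $U/\omega^*$, as stated later in § 5. The following sketch recomputes it from the first two rows; the names are hypothetical and the numerical values are simply those read from the table, so any misreading of an exponent in the scan would carry through.

```python
# U in km/sec and omega* in sec^-1, as read from Table II; l_tab is the tabulated l in cm.
TABLE_II = {
    "aluminium": {"U_km_s": 5.09, "omega_star": 3.03e11, "l_tab": 1.68e-6},
    "copper": {"U_km_s": 3.52, "omega_star": 1.11e11, "l_tab": 3.16e-6},
    "iron": {"U_km_s": 5.21, "omega_star": 1.43e12, "l_tab": 3.64e-7},
    "lead": {"U_km_s": 1.19, "omega_star": 5.88e11, "l_tab": 2.02e-7},
}

def unit_length_cm(U_km_s, omega_star):
    """l = U / omega*, with U converted from km/sec to cm/sec."""
    return (U_km_s * 1.0e5) / omega_star

for metal, row in TABLE_II.items():
    l_cm = unit_length_cm(row["U_km_s"], row["omega_star"])
    print(f"{metal:10s} l = {l_cm:.3e} cm  (tabulated {row['l_tab']:.2e})")
```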


The values of $\omega^*$ and $l$ for aluminium, copper, iron and lead are given in Table II. Again it is assumed that $T = 20°$ C. The range of frequencies actually obtainable in a solid is limited above by the cut-off frequency $\omega_c$

F'K.S.E.—VOL, LXV, A, 1958-59, PART II 9 







of the Debye spectrum. For longitudinal vibrations in a bar 

(3 - 9) 

(Brillouin 1938, p. 324), $M$ being the mass of an atom of the metal forming the bar. The values of $\omega_c$ are included in Table II.

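In practical problems still another system of units may be employed. For instance, if we are considering the propagation of stress in a rod of length 1 metre, then it is obviously desirable to take $l = 10^2$ cm. and we may choose

$$\tau = 10^2/U. \tag{3.10}$$

With this choice of units we find that we may write the equations in the forms

$$\frac{\partial\sigma}{\partial x} = \frac{\partial^2u}{\partial t^2}, \tag{3.11}$$

$$\sigma = \frac{\partial u}{\partial x} - b\theta, \tag{3.12}$$

$$\zeta\frac{\partial^2\theta}{\partial x^2} = \frac{\partial\theta}{\partial t} + \frac{g}{f}\frac{\partial^2u}{\partial x\,\partial t}, \tag{3.13}$$

where

$$\zeta = \frac{1}{f} = \frac{k}{\rho cU} \times 10^{-2}, \quad \frac{g}{f} = \frac{\alpha E}{\rho c}, \quad b = \alpha T, \tag{3.14}$$

all the physical quantities being measured in c.g.s. units.

For the metals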

we have been considering we get the values of $\zeta$ and $\tau$ given in Table III;

Table III

               Aluminium        Copper           Iron             Lead
τ (sec.)       1.97 x 10^-4     2.84 x 10^-4     1.92 x 10^-4     8.40 x 10^-4
ζ              1.68 x 10^-8     3.16 x 10^-8     3.68 x 10^-9     2.02 x 10^-8

in this system of units $g/f$ has the same value as $g$ has in Table II, and $b$ has the same value as in Tables I and II (on the assumption that $T = 20°$ C.).
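The unit of time for the one-metre rod follows from equation (3.10), read here as $\tau = 10^2/U$ with $U$ in cm./sec. The sketch below recomputes it from the wave velocities of Table II; the names are hypothetical, and the tabulated values it is compared with assume that every entry of the $\tau$ row of Table III carries the factor $10^{-4}$, which is an inference where the scan is unclear.

```python
# Elastic wave velocities U in km/sec, as read from Table II.
U_KM_PER_SEC = {"aluminium": 5.09, "copper": 3.52, "iron": 5.21, "lead": 1.19}

# tau in seconds as read from Table III (exponents assumed to be 10^-4 throughout).
TAU_TABLE_III = {"aluminium": 1.97e-4, "copper": 2.84e-4, "iron": 1.92e-4, "lead": 8.40e-4}

def unit_time_sec(U_km_s, length_cm=1.0e2):
    """Time for an elastic wave to traverse the unit length: tau = l / U."""
    return length_cm / (U_km_s * 1.0e5)

for metal, U in U_KM_PER_SEC.items():
    print(f"{metal:10s} tau = {unit_time_sec(U):.3e} sec")
```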

4. Radiation from a Rod 

Equation (2.11) governs the flow of heat in the rod when the surface of the rod is rendered impervious to heat. The equation has to be modified when radiation takes place into a medium at constant temperature, $T$ say. The conduction equation (2.9) must then be replaced by the equation

$$\frac{\partial h}{\partial t} = k\frac{\partial^2\theta}{\partial x^2} - \frac{Hp}{A}\theta, \tag{4.1}$$

where $A$ is the cross-sectional area of the rod, $p$ is the perimeter of a cross-section and $H$ is the emissivity of the surface. Because of this equation (2.11) must be replaced by

$$\frac{\partial\theta}{\partial t} = \kappa\frac{\partial^2\theta}{\partial x^2} - \gamma'\frac{\partial^2u}{\partial x\,\partial t} - \nu\theta, \tag{4.2}$$

where

$$\nu = \frac{Hp}{c\rho A}. \tag{4.3}$$

This means that in turn equation (3.3) must be replaced by

$$\frac{\partial^2\theta}{\partial x^2} = f\frac{\partial\theta}{\partial t} + g\frac{\partial^2u}{\partial x\,\partial t} + j\theta, \tag{4.4}$$

where $f$ and $g$ have the same values as before and

$$j = \frac{Hpl^2}{kA}. \tag{4.5}$$


5. Simple Waves in an Infinite Rod 

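We shall consider first of all the propagation of waves in a rod whose surface is impervious to heat. If we put $X = 0$ in equation (3.1) and put each of the quantities $u$, $\sigma$, $\theta$ proportional to $e^{i\omega t}$, then equations (3.1), (3.2) and (3.3) become

$$D\sigma = -a\omega^2u, \tag{5.1}$$

$$\sigma = Du - b\theta, \tag{5.2}$$

$$D^2\theta = i\omega f\theta + i\omega gDu, \tag{5.3}$$

where $D = d/dx$. It follows that each of the dependent variables satisfies the equation

$$\begin{vmatrix} a\omega^2 & D & 0 \\ D & -1 & -b \\ i\omega gD & 0 & -(D^2 - i\omega f) \end{vmatrix}\,\phi = 0,$$

which may be written in the form

$$(D^2 - \mu_1^2)(D^2 - \mu_2^2)\phi = 0, \tag{5.4}$$

where $\mu_1^2$, $\mu_2^2$ are the roots of the quadratic equation

$$(\mu^2 + a\omega^2)(\mu^2 - i\omega f) - i\omega bg\mu^2 = 0.$$

If we write

$$\Delta^2 = [a\omega^2 - i(f + bg)\omega]^2 + 4iaf\omega^3, \tag{5.5}$$

then we have

$$\mu_1^2 = \tfrac{1}{2}[-a\omega^2 + i\omega(f + bg) + \Delta], \tag{5.6}$$

$$\mu_2^2 = \tfrac{1}{2}[-a\omega^2 + i\omega(f + bg) - \Delta] \tag{5.7}$$

and the linearly independent solutions of equation (5.4) are the set of functions $e^{\pm\mu_1x}$, $e^{\pm\mu_2x}$; equations (3.1), (3.2) and (3.3) therefore have solutions of the form

$$\begin{bmatrix} u \\ \sigma \\ \theta \end{bmatrix} = \begin{bmatrix} u_1 \\ \sigma_1 \\ \theta_1 \end{bmatrix}e^{\mu_1x + i\omega t} + \begin{bmatrix} u_2 \\ \sigma_2 \\ \theta_2 \end{bmatrix}e^{-\mu_1x + i\omega t} + \begin{bmatrix} u_3 \\ \sigma_3 \\ \theta_3 \end{bmatrix}e^{\mu_2x + i\omega t} + \begin{bmatrix} u_4 \\ \sigma_4 \\ \theta_4 \end{bmatrix}e^{-\mu_2x + i\omega t},$$

where the $u_i$, $\sigma_i$, $\theta_i$ ($i = 1, 2, 3, 4$) are constants. Of these twelve constants only four can be chosen arbitrarily; the remaining eight must be chosen in such a way that the equations (5.1), (5.2) and (5.3) are satisfied identically.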

It is obvious from an examination of equations (5.6) and (5.7) that $\mu_2$ is the root which corresponds to the longitudinal elastic wave. Now the phase velocity of the wave $\phi = \phi_0\exp(-\mu_2x + i\omega t)$ is given by the equation

$$V = \frac{\omega}{\Im(\mu_2)} \tag{5.8}$$

and the attenuation constant is

$$q = \Re(\mu_2). \tag{5.9}$$

In problems of this kind the most appropriate units are those of Table II. Putting $a = f = 1$, $bg = \alpha^2ET/\rho c = \epsilon$ in equations (5.6) and (5.7) we then find that

$$\mu_1^2 = \tfrac{1}{2}[-\omega^2 + i(1 + \epsilon)\omega + \Delta], \quad \mu_2^2 = \tfrac{1}{2}[-\omega^2 + i(1 + \epsilon)\omega - \Delta], \tag{5.10}$$

where

$$\Delta^2 = [\omega^2 - i(1 + \epsilon)\omega]^2 + 4i\omega^3. \tag{5.11}$$

Remembering that, in these units, the unit of velocity is $U$ and the unit of length is $U/\omega^*$ we find from equations (5.8) and (5.9) that

$$V = \frac{\omega}{\Im(\mu_2)}\,U, \quad q = \Re(\mu_2)\,\frac{\omega^*}{U}, \tag{5.12}$$

where $\mu_2$ is given by the second of equations (5.10).
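Equations (5.10) to (5.12) can be evaluated directly. The sketch below is an illustration in the units of Table II (so $\omega$ is measured in units of $\omega^*$, velocities in units of $U$ and $q$ in units of $\omega^*/U$); the function names are hypothetical. Because the sign attached to $\Delta$ depends on the branch of the square root, the elastic root is selected here as the root of the quadratic lying nearest to $-\omega^2$, which is its exact value when $\epsilon = 0$; this selection rule is an assumption of the sketch and is not stated in the paper.

```python
import cmath

def elastic_root(omega, eps):
    """mu_2 from (mu^2 + w^2)(mu^2 - i w) - i eps w mu^2 = 0, with a = f = 1, bg = eps."""
    lin = omega**2 - 1j * (1.0 + eps) * omega  # coefficient of mu^2 in the quadratic
    delta = cmath.sqrt(lin**2 + 4j * omega**3)  # one branch of Delta in (5.11)
    candidates = [(-lin + delta) / 2, (-lin - delta) / 2]
    # For eps = 0 the elastic root is exactly -w^2, so take the nearer candidate.
    mu_squared = min(candidates, key=lambda y: abs(y + omega**2))
    return cmath.sqrt(mu_squared)  # principal value, non-negative real part

def phase_velocity_and_attenuation(omega, eps):
    """Return (V/U, q U/omega*) from equations (5.8) and (5.9)."""
    mu2 = elastic_root(omega, eps)
    return omega / mu2.imag, mu2.real

eps_aluminium = 5.61e-3  # value of epsilon for aluminium as read from Table I
for omega in (1e-2, 1e-1, 1.0, 1e1, 1e2):
    v, q = phase_velocity_and_attenuation(omega, eps_aluminium)
    print(f"omega/omega* = {omega:8.2e}   V/U = {v:.4f}   q U/omega* = {q:.3e}")
```

At low frequency the computed velocity ratio approaches $1 + \tfrac{1}{2}\epsilon$, and at high frequency the attenuation approaches $\tfrac{1}{2}\epsilon$ in these units.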

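The algebraic expressions for the roots $\mu_1$, $\mu_2$ would be very cumbersome
in the general case but it is a simple matter to approximate to them if $\omega$
is either very small or very large in comparison with unity (i.e. in comparison
with $\omega^*$). If $\omega \ll 1$ in this system of units then it is easily shown that
$\mu_1$ and $\mu_2$ take the approximate values $\mu_1^{(0)}$, $\mu_2^{(0)}$ defined by the equations

$$\mu_1^{(0)} = \pm(\tfrac{1}{2}\omega)^{1/2}(1 + \tfrac{1}{2}\epsilon)(1 + i), \qquad \mu_2^{(0)} = \pm\{\tfrac{1}{2}\epsilon(1 - \tfrac{5}{2}\epsilon)\omega^2 + (1 - \tfrac{1}{2}\epsilon)i\omega\}, \tag{5.13}$$

where it has been assumed that $\epsilon$ also is small. Similarly if $\omega \gg 1$ in this
system it can be shown that $\mu_1$ and $\mu_2$ take the approximate values $\mu_1^{(\infty)}$,
$\mu_2^{(\infty)}$ where

$$\mu_1^{(\infty)} = \pm(\tfrac{1}{2}\omega)^{1/2}\left[\left(1 - \frac{\epsilon}{2\omega}\right) + i\left(1 + \frac{\epsilon}{2\omega}\right)\right], \qquad \mu_2^{(\infty)} = \pm(\tfrac{1}{2}\epsilon + i\omega). \tag{5.14}$$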


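It follows from equations (5.13), (5.14) and (5.12) that in "ordinary" units

$$q = \begin{cases} (1 - \tfrac{5}{2}\epsilon)\,q_\infty\,(\omega/\omega^*)^2, & \omega \ll \omega^*, \\ q_\infty, & \omega \gg \omega^*, \end{cases} \tag{5.15}$$

where

$$q_\infty = \frac{\epsilon\omega^*}{2U} \tag{5.16}$$

and from equation (5.12) that for $\omega \ll \omega^*$

$$V = (1 + \tfrac{1}{2}\epsilon)U, \tag{5.17}$$

while for $\omega \gg \omega^*$, $V = U$.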

The attenuation constant $q$ is therefore an increasing function of
the frequency $\omega$ of the waves, varying like $\omega^2$ for low frequencies and
approaching the value $q_\infty$ asymptotically as $\omega \to \infty$. Equation (5.17)
shows that for low frequencies, i.e. for frequencies less than $10^{10}$ sec.$^{-1}$,
the phase velocity of the longitudinal elastic waves is $(1 + \tfrac{1}{2}\epsilon)$ times that of
the elastic wave in a medium not exhibiting a thermal effect. This result
can be interpreted in a different way. If measurements of the velocity of
longitudinal waves in rods were used to determine the Young's modulus
of the material of the rods then the "dynamical" value $E_d$ of the Young's
modulus would be related to the statical value $E$ through the equation

$$E_d = (1 + \epsilon)E.$$





From the values given in Table I (p. 125) we see that in the case of aluminium
the dynamical value will be 0.6 per cent higher than the statical value,
for copper it will be 0.3 per cent higher, for iron 0.7 per cent higher and
for lead 0.3 per cent higher. The effect of the thermal properties of the
bar on these values is therefore rather slight.

Table IV

V/U

ω/ω*        Aluminium   Copper    Iron      Lead
10^-2       1.0028      1.0013    1.0035    1.0017
10^-1       1.0028      1.0013    1.0034    1.0017
1           1.0014      1.0007    1.0018    1.0009
10          1.0000      1.0000    1.0000    1.0000
10^2        1.0000      1.0000    1.0000    1.0000
1 + ½ε      1.0028      1.0013    1.0035    1.0017


In order to test the validity of these arguments a series of calculations
based on equations (5.6), (5.7), (5.8) and (5.9) was carried out for the four
metals listed in Table I for a range of frequencies extending from $10^{-11}\omega^*$
to $10^{3}\omega^*$. A programme for carrying out these calculations on DEUCE,
the high-speed electronic computer installed at the National Physical
Laboratory, Teddington, was devised by Mr J. H. Wilkinson of the
Mathematics Division of that Laboratory. Some of the results obtained
from this programme are shown in Tables IV and V. It will be observed
from Table IV that if $\omega/\omega^* = 10^{-2}$ then $V = (1 + \tfrac{1}{2}\epsilon)U$; further calculations
on DEUCE (not reported in full here) confirm that over the range
$10^{-11} < \omega/\omega^* < 10^{-2}$ the phase velocity of longitudinal elastic waves in
the rod has this constant value. When $\omega = \omega^*$ the phase velocity of these
waves is almost exactly $(1 + \tfrac{1}{4}\epsilon)U$ and when $\omega > 10\omega^*$ it falls to the value
$U$, independent of $\epsilon$. Table V shows the values of $q/q_\infty$ for the same range
of frequencies; for higher values of $\omega$ it was found that $q = q_\infty$ while for
$\omega < 10^{-2}\omega^*$ it was found that $q$ was given accurately by equation (5.15).

Table V

q/q∞

ω/ω*        Aluminium        Copper           Iron             Lead
10^-2       0.9846 × 10^-4   0.9917 × 10^-4   0.9815 × 10^-4   0.9910 × 10^-4
10^-1       0.9863 × 10^-2   0.9837 × 10^-2   0.9729 × 10^-2   0.9828 × 10^-2
1           0.4991           0.4996           0.4988           0.5000
10          0.9899           0.9901           0.9896           0.9910
10^2        0.9995           0.9999           0.9993           1.0000
1 − (5/2)ε  0.9860           0.9933           0.9826           0.9914







Propagation of Thermal Stresses in Thin Metallic Rods 131 

It is obvious from these results that there is a sharp change in the
values of $V$ and $q$ in the vicinity of the frequency $\omega^*$. It would seem
therefore that the frequency $\omega^*$, which was introduced for the purely
mathematical reason that its choice put the basic equations in a simple
form, has in fact a definite physical significance.


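6. Semi-infinite Rods: Simple Boundary Conditions

We shall consider now the state of stress and the distribution of temperature
in the semi-infinite rod $x > 0$ when the end $x = 0$ is disturbed by
the application of stresses or temperatures which have a time dependence
of the sinusoidal form $e^{i\omega t}$ where $\omega$ is real.

If we assume that the amplitudes of the thermal and elastic waves do
not increase indefinitely as $x \to \infty$ then we must take values of $\mu_1$, $\mu_2$ with
negative real parts. It is readily verified that if $\mu_1$ and $\mu_2$ are given by
equations (5.10), negative real parts being selected, then the expressions

$$u(x, t) = b\left\{\frac{\mu_1 C_1}{\mu_1^2 + a\omega^2}\,e^{\mu_1 x + i\omega t} + \frac{\mu_2 C_2}{\mu_2^2 + a\omega^2}\,e^{\mu_2 x + i\omega t}\right\}, \tag{6.1}$$

$$\sigma(x, t) = -ba\omega^2\left\{\frac{C_1}{\mu_1^2 + a\omega^2}\,e^{\mu_1 x + i\omega t} + \frac{C_2}{\mu_2^2 + a\omega^2}\,e^{\mu_2 x + i\omega t}\right\}, \tag{6.2}$$

$$\theta(x, t) = C_1 e^{\mu_1 x + i\omega t} + C_2 e^{\mu_2 x + i\omega t}, \tag{6.3}$$

satisfy the basic equations (3.1), (3.2) and (3.3) with $X = 0$. These solutions
lead to the boundary values

$$u(0, t) = b\left\{\frac{\mu_1 C_1}{\mu_1^2 + a\omega^2} + \frac{\mu_2 C_2}{\mu_2^2 + a\omega^2}\right\}e^{i\omega t}, \tag{6.4}$$

$$\sigma(0, t) = -ba\omega^2\left\{\frac{C_1}{\mu_1^2 + a\omega^2} + \frac{C_2}{\mu_2^2 + a\omega^2}\right\}e^{i\omega t}, \tag{6.5}$$

$$\theta(0, t) = (C_1 + C_2)e^{i\omega t}, \tag{6.6}$$

$$\frac{\partial\theta(0, t)}{\partial x} = (\mu_1 C_1 + \mu_2 C_2)e^{i\omega t}. \tag{6.7}$$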


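Case (i):

$$\theta(0, t) = \Theta e^{i\omega t}, \qquad \sigma(0, t) = 0.$$

For this set of boundary conditions we must choose $C_1$ and $C_2$ so that

$$C_1 + C_2 = \Theta, \qquad (\mu_2^2 + a\omega^2)C_1 + (\mu_1^2 + a\omega^2)C_2 = 0,$$

from which it follows that

$$C_1 = -\frac{\mu_1^2 + a\omega^2}{\mu_2^2 - \mu_1^2}\,\Theta, \qquad C_2 = \frac{\mu_2^2 + a\omega^2}{\mu_2^2 - \mu_1^2}\,\Theta,$$

so that the desired solution is

$$u(x, t) = -\frac{b\Theta}{\mu_2^2 - \mu_1^2}\left(\mu_1 e^{\mu_1 x + i\omega t} - \mu_2 e^{\mu_2 x + i\omega t}\right), \tag{6.8}$$

$$\sigma(x, t) = \frac{ab\omega^2\Theta}{\mu_2^2 - \mu_1^2}\left(e^{\mu_1 x + i\omega t} - e^{\mu_2 x + i\omega t}\right), \tag{6.9}$$

$$\theta(x, t) = -\frac{\Theta}{\mu_2^2 - \mu_1^2}\left[(\mu_1^2 + a\omega^2)e^{\mu_1 x + i\omega t} - (\mu_2^2 + a\omega^2)e^{\mu_2 x + i\omega t}\right]. \tag{6.10}$$

In particular

$$u(0, t) = \frac{b\Theta}{\mu_1 + \mu_2}\,e^{i\omega t}. \tag{6.11}$$

Case (ii):

$$\theta(0, t) = \Theta e^{i\omega t}, \qquad u(0, t) = 0.$$

In a similar way we can show that the solution of this boundary value
problem is

$$u(x, t) = \frac{b\mu_1\mu_2\Theta}{(\mu_1 - \mu_2)(\mu_1\mu_2 - a\omega^2)}\left(e^{\mu_1 x + i\omega t} - e^{\mu_2 x + i\omega t}\right), \tag{6.12}$$

$$\sigma(x, t) = -\frac{ab\omega^2\Theta}{(\mu_1 - \mu_2)(\mu_1\mu_2 - a\omega^2)}\left(\mu_2 e^{\mu_1 x + i\omega t} - \mu_1 e^{\mu_2 x + i\omega t}\right), \tag{6.13}$$

$$\theta(x, t) = \frac{\Theta}{(\mu_1 - \mu_2)(\mu_1\mu_2 - a\omega^2)}\left[\mu_2(\mu_1^2 + a\omega^2)e^{\mu_1 x + i\omega t} - \mu_1(\mu_2^2 + a\omega^2)e^{\mu_2 x + i\omega t}\right]. \tag{6.14}$$

In particular

$$\sigma(0, t) = \frac{ab\omega^2\Theta}{\mu_1\mu_2 - a\omega^2}\,e^{i\omega t}.$$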

Since $\mu_1^2\mu_2^2 = -iaf\omega^3$ it follows that in this case

~j (!+*>". (6.I S ) 

Case (iii):

$$\sigma(0, t) = \Pi e^{i\omega t}, \qquad \partial\theta(0, t)/\partial x = 0.$$

In this instance the constants $C_1$ and $C_2$ are determined by the equations

$$\mu_1 C_1 + \mu_2 C_2 = 0, \qquad \frac{C_1}{\mu_1^2 + a\omega^2} + \frac{C_2}{\mu_2^2 + a\omega^2} = -\frac{\Pi}{ab\omega^2},$$

so that

$$\frac{C_1}{\mu_2} = -\frac{C_2}{\mu_1} = \frac{\Pi(\mu_1^2 + a\omega^2)(\mu_2^2 + a\omega^2)}{ab\omega^2(\mu_1 - \mu_2)(\mu_1^2 + \mu_1\mu_2 + \mu_2^2 + a\omega^2)}. \tag{6.16}$$

The solution of the boundary value problem is given by equations
(6.1), (6.2) and (6.3) with $C_1$ and $C_2$ given by the equations (6.16). In
particular we have the boundary expressions

$$\theta(0, t) = -\Pi\,\frac{(\mu_1^2 + a\omega^2)(\mu_2^2 + a\omega^2)}{ab\omega^2(\mu_1^2 + \mu_1\mu_2 + \mu_2^2 + a\omega^2)}\,e^{i\omega t}, \tag{6.17}$$

$$u(0, t) = -\Pi\,\frac{\mu_1\mu_2(\mu_1 + \mu_2)}{a\omega^2(\mu_1^2 + \mu_1\mu_2 + \mu_2^2 + a\omega^2)}\,e^{i\omega t}. \tag{6.18}$$


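Case (iv):

$$\sigma(0, t) = \Pi e^{i\omega t}, \qquad \theta(0, t) = 0.$$

The solution of this problem is

$$u(x, t) = \frac{\Pi}{a\omega^2(\mu_1^2 - \mu_2^2)}\left\{\mu_1(\mu_2^2 + a\omega^2)e^{\mu_1 x + i\omega t} - \mu_2(\mu_1^2 + a\omega^2)e^{\mu_2 x + i\omega t}\right\}, \tag{6.19}$$

$$\sigma(x, t) = -\frac{\Pi}{\mu_1^2 - \mu_2^2}\left\{(\mu_2^2 + a\omega^2)e^{\mu_1 x + i\omega t} - (\mu_1^2 + a\omega^2)e^{\mu_2 x + i\omega t}\right\}, \tag{6.20}$$

$$\theta(x, t) = \frac{(\mu_1^2 + a\omega^2)(\mu_2^2 + a\omega^2)\Pi}{ab\omega^2(\mu_1^2 - \mu_2^2)}\left(e^{\mu_1 x + i\omega t} - e^{\mu_2 x + i\omega t}\right), \tag{6.21}$$

whence

$$u(0, t) = -\frac{(\mu_1\mu_2 - a\omega^2)\Pi e^{i\omega t}}{a\omega^2(\mu_1 + \mu_2)}. \tag{6.22}$$

Case (v):

$$u(0, t) = A e^{i\omega t}, \qquad \partial\theta(0, t)/\partial x = 0.$$

The solution of this boundary value problem is

$$u(x, t) = \frac{A}{\mu_2^2 - \mu_1^2}\left\{(\mu_2^2 + a\omega^2)e^{\mu_1 x + i\omega t} - (\mu_1^2 + a\omega^2)e^{\mu_2 x + i\omega t}\right\}, \tag{6.23}$$

$$\sigma(x, t) = -\frac{a\omega^2 A}{\mu_2^2 - \mu_1^2}\left\{\frac{\mu_2^2 + a\omega^2}{\mu_1}\,e^{\mu_1 x + i\omega t} - \frac{\mu_1^2 + a\omega^2}{\mu_2}\,e^{\mu_2 x + i\omega t}\right\}, \tag{6.24}$$

$$\theta(x, t) = \frac{(\mu_1^2 + a\omega^2)(\mu_2^2 + a\omega^2)A}{b(\mu_2^2 - \mu_1^2)}\left(\mu_1^{-1}e^{\mu_1 x + i\omega t} - \mu_2^{-1}e^{\mu_2 x + i\omega t}\right), \tag{6.25}$$

from which we have

$$\sigma(0, t) = -\frac{a\omega^2 A(\mu_1^2 + \mu_1\mu_2 + \mu_2^2 + a\omega^2)}{\mu_1\mu_2(\mu_1 + \mu_2)}\,e^{i\omega t}, \tag{6.26}$$

$$\theta(0, t) = \frac{A(\mu_1^2 + a\omega^2)(\mu_2^2 + a\omega^2)}{b\mu_1\mu_2(\mu_1 + \mu_2)}\,e^{i\omega t}. \tag{6.27}$$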


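Case (vi):

$$u(0, t) = A e^{i\omega t}, \qquad \theta(0, t) = 0.$$

In this case we have

$$u(x, t) = \frac{A}{(\mu_2 - \mu_1)(\mu_1\mu_2 - a\omega^2)}\left\{\mu_1(\mu_2^2 + a\omega^2)e^{\mu_1 x + i\omega t} - \mu_2(\mu_1^2 + a\omega^2)e^{\mu_2 x + i\omega t}\right\}, \tag{6.28}$$

$$\sigma(x, t) = -\frac{a\omega^2 A}{(\mu_2 - \mu_1)(\mu_1\mu_2 - a\omega^2)}\left\{(\mu_2^2 + a\omega^2)e^{\mu_1 x + i\omega t} - (\mu_1^2 + a\omega^2)e^{\mu_2 x + i\omega t}\right\}, \tag{6.29}$$

$$\theta(x, t) = \frac{(\mu_1^2 + a\omega^2)(\mu_2^2 + a\omega^2)A}{b(\mu_2 - \mu_1)(\mu_1\mu_2 - a\omega^2)}\left(e^{\mu_1 x + i\omega t} - e^{\mu_2 x + i\omega t}\right), \tag{6.30}$$

and hence

$$\sigma(0, t) = -\frac{a\omega^2(\mu_1 + \mu_2)A e^{i\omega t}}{\mu_1\mu_2 - a\omega^2}. \tag{6.31}$$

The reciprocal nature of the pair of equations (6.31) and (6.22) will be
observed; the pair (6.18) and (6.26) are similarly related.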
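The six cases differ only in which two of the four boundary values (6.4) to (6.7) are prescribed, so that each amounts to the solution of a pair of linear equations for $C_1$ and $C_2$. The following Python sketch illustrates this and checks the boundary value (6.11) and the two reciprocal pairs numerically. The function names and the numerical values of $a$, $b$, $f$, $g$ and $\omega$ are hypothetical choices made for illustration only.

```python
import cmath


def decaying_roots(omega, a, b, f, g):
    """mu_1, mu_2 with negative real parts from (mu^2 + a w^2)(mu^2 - i w f) - i w b g mu^2 = 0."""
    B = a * omega**2 - 1j * omega * (f + b * g)
    C = -1j * a * f * omega**3
    delta = cmath.sqrt(B * B - 4.0 * C)
    return tuple(-cmath.sqrt(z) for z in ((-B + delta) / 2.0, (-B - delta) / 2.0))


def boundary_rows(omega, a, b, f, g):
    """Coefficients of (C1, C2) in the boundary values, the factor exp(i w t) being dropped."""
    m1, m2 = decaying_roots(omega, a, b, f, g)
    d1, d2 = m1 * m1 + a * omega**2, m2 * m2 + a * omega**2
    rows = {
        "u": (b * m1 / d1, b * m2 / d2),                            # displacement at x = 0
        "sigma": (-a * b * omega**2 / d1, -a * b * omega**2 / d2),  # stress at x = 0
        "theta": (1.0, 1.0),                                        # temperature at x = 0
        "dtheta": (m1, m2),                                         # temperature gradient at x = 0
    }
    return rows, (m1, m2)


def solve_case(rows, given):
    """Solve for C1, C2 from two prescribed amplitudes and return all four boundary amplitudes."""
    (k1, v1), (k2, v2) = given.items()
    (p, q), (r, s) = rows[k1], rows[k2]
    det = p * s - q * r
    c1, c2 = (v1 * s - q * v2) / det, (p * v2 - r * v1) / det  # Cramer's rule
    return {name: row[0] * c1 + row[1] * c2 for name, row in rows.items()}


# Hypothetical parameter values, chosen only to exercise the formulae.
rows, (mu1, mu2) = boundary_rows(0.7, a=1.0, b=0.1, f=1.0, g=0.056)
case_i = solve_case(rows, {"theta": 1.0, "sigma": 0.0})
case_iii = solve_case(rows, {"sigma": 1.0, "dtheta": 0.0})
case_iv = solve_case(rows, {"sigma": 1.0, "theta": 0.0})
case_v = solve_case(rows, {"u": 1.0, "dtheta": 0.0})
case_vi = solve_case(rows, {"u": 1.0, "theta": 0.0})
```

For unit amplitudes the product of the displacement found in case (iv) and the stress found in case (vi) should be unity, and similarly for cases (iii) and (v); this is the reciprocity noted above.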


7. Semi-infinite Rods: Solution by the Laplace Transform 

In this section we shall consider the conditions in the semi-infinite
rod $x > 0$ when the end $x = 0$ is subjected to certain stresses and thermal
conditions. It is assumed that at $t = 0$ the stress and displacement are
zero as is the deviation $\theta$ of the temperature from the reference temperature.

If we assume that there are no body forces present so that we may
put $X = 0$ in equation (3.1) then it is readily seen that the set of equations
(3.1), (3.2) and (3.3) is equivalent to the set of ordinary differential equations

$$D\bar\sigma = as^2\bar u, \tag{7.1}$$

$$\bar\sigma = D\bar u - b\bar\theta, \tag{7.2}$$

$$D^2\bar\theta = sf\bar\theta + gsD\bar u \tag{7.3}$$

($D = d/dx$), for the Laplace transforms

$$(\bar u, \bar\sigma, \bar\theta) = \int_0^\infty (u, \sigma, \theta)\,e^{-st}\,dt \tag{7.4}$$

of $u$, $\sigma$ and $\theta$. It is readily seen that each of these quantities satisfies the
fourth order ordinary differential equation

$$(D^2 - \kappa_1^2)(D^2 - \kappa_2^2)\,\phi = 0, \tag{7.5}$$

where

$$\kappa_1^2 + \kappa_2^2 = as^2 + (f + bg)s, \qquad \kappa_1^2\kappa_2^2 = afs^3. \tag{7.6}$$

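On the assumption that $\sigma$, $u$ and $\theta$ all tend to zero (or at worst remain
finite) as $x \to \infty$ we take solutions of equations (7.1), (7.2) and (7.3) of the
form

$$\bar u(x, s) = -b\left\{\frac{\kappa_1 C_1 e^{-\kappa_1 x}}{\kappa_1^2 - as^2} + \frac{\kappa_2 C_2 e^{-\kappa_2 x}}{\kappa_2^2 - as^2}\right\}, \tag{7.7}$$

$$\bar\sigma(x, s) = abs^2\left\{\frac{C_1 e^{-\kappa_1 x}}{\kappa_1^2 - as^2} + \frac{C_2 e^{-\kappa_2 x}}{\kappa_2^2 - as^2}\right\}, \tag{7.8}$$

$$\bar\theta(x, s) = C_1 e^{-\kappa_1 x} + C_2 e^{-\kappa_2 x},$$

where it is assumed that $\Re(\kappa_1)$ and $\Re(\kappa_2)$ are both positive, and $C_1$ and $C_2$
are determined from the conditions at $x = 0$. To facilitate the calculation
of $C_1$ and $C_2$ from the boundary conditions we have the relations

$$\bar u(0, s) = -b\left\{\frac{\kappa_1 C_1}{\kappa_1^2 - as^2} + \frac{\kappa_2 C_2}{\kappa_2^2 - as^2}\right\}, \tag{7.9}$$

$$\bar\sigma(0, s) = abs^2\left\{\frac{C_1}{\kappa_1^2 - as^2} + \frac{C_2}{\kappa_2^2 - as^2}\right\}, \tag{7.10}$$

$$\bar\theta(0, s) = C_1 + C_2, \tag{7.11}$$

$$D\bar\theta(0, s) = -(\kappa_1 C_1 + \kappa_2 C_2). \tag{7.12}$$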


It is obvious from this set of equations that if the boundary values of any
two of the four quantities $\bar u$, $\bar\sigma$, $\bar\theta$, $D\bar\theta$ are prescribed those of the remaining
two are uniquely determined.

For instance, suppose that the stress is prescribed to have the value $\Pi(t)$
on the free end and it is assumed that there is no flux of heat across this
end. Then $\sigma(0, t) = \Pi(t)$ and $\partial\theta(0, t)/\partial x = 0$. From these conditions it
follows immediately that $D\bar\theta(0, s) = 0$ and that $\bar\sigma(0, s) = \bar\Pi(s)$, where $\bar\Pi(s)$ is
the Laplace transform of the prescribed function $\Pi(t)$. Substituting these
values in equations (7.10) and (7.12) and solving for $C_1$ and $C_2$ we find that

$$\frac{C_1}{-\kappa_2} = \frac{C_2}{\kappa_1} = \frac{gs\bar\Pi(s)}{(\kappa_2 - \kappa_1)[(f + bg)s + \kappa_1\kappa_2]}.$$





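Substituting these values in equation (7.11) we obtain the relation

$$\bar\theta(0, s) = -\frac{g\bar\Pi(s)}{(f + bg) + (af)^{1/2}s^{1/2}}.$$

If we choose $1/\omega^*$ as our unit of time and $U/\omega^*$ as our unit of length then
$a = f = 1$ and $bg = \epsilon$ so that we have

$$\bar\theta(0, s) = -\frac{g\bar\Pi(s)}{s^{1/2} + \alpha}, \qquad \alpha = 1 + \epsilon.$$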
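The elimination of $C_1$ and $C_2$ leading to this relation may be checked numerically. The sketch below, in Python, solves equations (7.10) and (7.12) directly for real positive $s$ and compares $C_1 + C_2$ with the closed form $-g\bar\Pi(s)/[(f + bg) + (af)^{1/2}s^{1/2}]$, which follows on using $\kappa_1\kappa_2 = (af)^{1/2}s^{3/2}$ from (7.6). The function names and the sample parameter values are hypothetical.

```python
import math


def theta_bar_at_end(s, a, b, f, g, pi_bar=1.0):
    """C1 + C2 obtained by solving (7.10) and (7.12) with D theta_bar(0, s) = 0."""
    total = a * s**2 + (f + b * g) * s   # kappa_1^2 + kappa_2^2, equation (7.6)
    prod = a * f * s**3                  # kappa_1^2 * kappa_2^2, equation (7.6)
    disc = math.sqrt(total**2 - 4.0 * prod)
    k1 = math.sqrt((total + disc) / 2.0)
    k2 = math.sqrt((total - disc) / 2.0)
    # (7.10): p*C1 + q*C2 = pi_bar ;  (7.12): k1*C1 + k2*C2 = 0
    p = a * b * s**2 / (k1**2 - a * s**2)
    q = a * b * s**2 / (k2**2 - a * s**2)
    det = p * k2 - q * k1
    c1 = pi_bar * k2 / det
    c2 = -pi_bar * k1 / det
    return c1 + c2                       # equation (7.11)


def closed_form(s, a, b, f, g, pi_bar=1.0):
    """The relation obtained in the text for theta_bar(0, s)."""
    return -g * pi_bar / ((f + b * g) + math.sqrt(a * f * s))


if __name__ == "__main__":
    for s in (0.5, 2.0, 10.0):
        print(s, theta_bar_at_end(s, 1.0, 0.1, 1.0, 0.056), closed_form(s, 1.0, 0.1, 1.0, 0.056))
```

The two columns printed should agree to rounding error for any positive values of the constants.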


Now if we write

$$F(\alpha, t) = \pi^{-1/2}t^{-1/2} - \alpha e^{\alpha^2 t}\,\mathrm{Erfc}(\alpha t^{1/2}) \tag{7.13}$$

we find that

$$\bar F(\alpha, s) = \frac{1}{s^{1/2} + \alpha} \tag{7.14}$$

(Erdélyi et al. 1954, vol. 1, p. 233) so that making use of the Faltung
theorem for Laplace transforms (Sneddon 1951, p. 31) we have the equation

$$\theta(0, t) = -g\int_0^t \Pi(\tau)\,F(1 + \epsilon, t - \tau)\,d\tau \tag{7.15}$$

by means of which to determine the variation of temperature at the free
end ($x = 0$) of the rod.
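The transform pair (7.13), (7.14) can be confirmed by direct quadrature. The following Python sketch uses the substitution $t = x^2$ to remove the $t^{-1/2}$ singularity and Simpson's rule on a finite range; the function names, the upper limit and the number of intervals are arbitrary choices made for illustration, and the sketch assumes moderate values of $\alpha$ (for large $\alpha$ the exponential factor would overflow).

```python
import math


def F(alpha, t):
    """F(alpha, t) of equation (7.13)."""
    return 1.0 / math.sqrt(math.pi * t) - alpha * math.exp(alpha**2 * t) * math.erfc(alpha * math.sqrt(t))


def laplace_of_F(alpha, s, x_max=8.0, n=4000):
    """Simpson's rule for the integral of F(alpha, t) exp(-s t) dt after putting t = x**2."""
    def integrand(x):
        # F(alpha, x**2) * 2x, which is smooth at x = 0
        smooth = 2.0 / math.sqrt(math.pi) - 2.0 * alpha * x * math.exp((alpha * x)**2) * math.erfc(alpha * x)
        return smooth * math.exp(-s * x * x)

    h = x_max / n  # n must be even for Simpson's rule
    total = integrand(0.0) + integrand(x_max)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0


if __name__ == "__main__":
    alpha, s = 1.0056, 1.0
    print(laplace_of_F(alpha, s), 1.0 / (math.sqrt(s) + alpha))
```

The two numbers printed should agree closely, in accordance with (7.14).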

For example, if in conventional units 




- HFo , o </<4 


/>/, 


n 


so that in our present units we have 

' -A, o<t< 4o>« 


nco- 


/>4<o* 


and it can be shown that 


6(0 ,/) = 


Af, 


[1 - e a,i Erfc (a/ 4 )], o < t < 4«» 


Erfc{a(/-4a,*)*}-e“*‘ Erfc (a#*)], / >4*,# 

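Returning to conventional units we have

$$\theta(0, t) = \begin{cases} \theta_m\{1 - e^{\gamma^2 t}\,\mathrm{Erfc}(\gamma t^{1/2})\}, & 0 < t < t_1, \\ \theta_m\{e^{\gamma^2(t - t_1)}\,\mathrm{Erfc}[\gamma(t - t_1)^{1/2}] - e^{\gamma^2 t}\,\mathrm{Erfc}(\gamma t^{1/2})\}, & t > t_1, \end{cases} \tag{7.16}$$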



where 

~ y 2 = (i + e) 2 ci>*. (7.17) 

I + € 

Values of $\theta_m$ and $\gamma$ for the metals we have considered previously are given
in Table VI for a reference temperature of 20° C.

Table VI

                               Aluminium      Copper         Iron           Lead
θ_m (°K.)                      214.4          155.6          197.5          103.4
γ (sec.^-½)                    5.53 × 10^5    3.34 × 10^5    1.20 × 10^6    7.69 × 10^5
m = π^-½ θ_m/γ (°K. sec.^½)    2.19 × 10^-4   2.63 × 10^-4   9.26 × 10^-5   7.58 × 10^-5

Since y is very large we may use the asymptotic expansion 

(7 -' 8) 


to obtain the formulae 


0(o, t)-- 




*>h- 


Furthermore if $t \gg t_1$ we have the approximate formula


yfaA)* V 


(7.19) 


(7.20) 


8. Finite Rods

Similar methods may be employed to determine the state of stress and
the distribution of temperature in the finite rod $0 < x < l$. There is a
variety of thermoelastic problems associated with such a finite rod, the
solution of each depending on the boundary conditions imposed.

In the first instance consider the vibrations possible in the rod when
the ends $x = 0$ and $x = l$ are subjected to the conditions

$$\sigma = 0, \qquad \frac{\partial\theta}{\partial x} = 0. \tag{8.1}$$

It is readily verified that if $\mu_1$ and $\mu_2$ are given by equations (5.10), then the
expression

$$\sigma = [\sigma_1\sinh\mu_1 x + \sigma_2\sinh\mu_2 x + \sigma_3(\cosh\mu_1 x - \cosh\mu_2 x)]e^{i\omega t}, \tag{8.2}$$

in which $\sigma_1$, $\sigma_2$, $\sigma_3$ are constants, satisfies the equation (5.4) and the condition
that $\sigma = 0$ when $x = 0$. From the relation

$$ab\omega^2\theta = -(D^2 + a\omega^2)\sigma$$

we deduce that the corresponding expression for $\theta$ is given by the equation

$$\theta = -\frac{1}{ab\omega^2}\left\{\sigma_1(\mu_1^2 + a\omega^2)\sinh\mu_1 x + \sigma_2(\mu_2^2 + a\omega^2)\sinh\mu_2 x + \sigma_3[(\mu_1^2 + a\omega^2)\cosh\mu_1 x - (\mu_2^2 + a\omega^2)\cosh\mu_2 x]\right\}e^{i\omega t}. \tag{8.3}$$

It follows immediately that the expressions (8.2) and (8.3) satisfy the
boundary conditions (8.1) at $x = 0$ and $x = l$ provided that

$$\mu_1(\mu_1^2 + a\omega^2)\sigma_1 + \mu_2(\mu_2^2 + a\omega^2)\sigma_2 = 0,$$

$$\mu_1(\mu_1^2 + a\omega^2)\sigma_1\cosh\mu_1 l + \mu_2(\mu_2^2 + a\omega^2)\sigma_2\cosh\mu_2 l + [\mu_1(\mu_1^2 + a\omega^2)\sinh\mu_1 l - \mu_2(\mu_2^2 + a\omega^2)\sinh\mu_2 l]\sigma_3 = 0,$$

$$\sigma_1\sinh\mu_1 l + \sigma_2\sinh\mu_2 l + \sigma_3(\cosh\mu_1 l - \cosh\mu_2 l) = 0.$$

Eliminating the constants $\sigma_1$, $\sigma_2$, $\sigma_3$ from these equations we see that
solutions of the type (8.2) and (8.3) are possible if $\omega$ is a root of the
transcendental equation

$$\begin{vmatrix} \mu_1(\mu_1^2 + a\omega^2) & \mu_2(\mu_2^2 + a\omega^2) & 0 \\ \mu_1(\mu_1^2 + a\omega^2)\cosh\mu_1 l & \mu_2(\mu_2^2 + a\omega^2)\cosh\mu_2 l & \mu_1(\mu_1^2 + a\omega^2)\sinh\mu_1 l - \mu_2(\mu_2^2 + a\omega^2)\sinh\mu_2 l \\ \sinh\mu_1 l & \sinh\mu_2 l & \cosh\mu_1 l - \cosh\mu_2 l \end{vmatrix} = 0,$$

where $\mu_1$ and $\mu_2$ are defined in terms of $\omega$ by the equations (5.10).
Expanding the determinant on the left-hand side of this equation we find
that the frequency equation becomes

$$2\mu_1\mu_2(\mu_1^2 + a\omega^2)(\mu_2^2 + a\omega^2)(\cosh\mu_1 l\cosh\mu_2 l - 1) = [\mu_1^2(\mu_1^2 + a\omega^2)^2 + \mu_2^2(\mu_2^2 + a\omega^2)^2]\sinh\mu_1 l\sinh\mu_2 l. \tag{8.4}$$
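The expansion of the determinant which leads to the frequency equation (8.4) can be checked numerically for arbitrary complex values of $\mu_1$ and $\mu_2$. The Python sketch below evaluates the $3 \times 3$ determinant directly and compares it with the difference of the two sides of (8.4); the function names and the sample values are hypothetical and the check is purely algebraic (the values of $\mu_1$, $\mu_2$ used are not roots of the quadratic).

```python
import cmath


def det3(m):
    """Determinant of a 3 x 3 matrix given as nested sequences."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def frequency_determinant(mu1, mu2, a, omega, length):
    """Return (determinant, left side minus right side of (8.4))."""
    P = mu1 * (mu1**2 + a * omega**2)
    Q = mu2 * (mu2**2 + a * omega**2)
    c1, c2 = cmath.cosh(mu1 * length), cmath.cosh(mu2 * length)
    s1, s2 = cmath.sinh(mu1 * length), cmath.sinh(mu2 * length)
    matrix = [
        [P, Q, 0.0],
        [P * c1, Q * c2, P * s1 - Q * s2],
        [s1, s2, c1 - c2],
    ]
    expanded = 2.0 * P * Q * (c1 * c2 - 1.0) - (P * P + Q * Q) * s1 * s2
    return det3(matrix), expanded


if __name__ == "__main__":
    print(frequency_determinant(0.3 + 0.4j, 0.1 + 0.9j, 1.0, 0.7, 1.3))
```

The two quantities returned should coincide, so that the vanishing of the determinant is equivalent to equation (8.4).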


The method of the Laplace transform can also be applied to problems
concerning rods of finite length $l$. Suppose, for instance, that we have
the following boundary conditions for the stress $\sigma(x, t)$ and the temperature
variation $\theta(x, t)$:

$$\sigma(0, t) = \Pi(t), \qquad \sigma(l, t) = 0, \qquad \frac{\partial\theta(0, t)}{\partial x} = 0, \qquad \frac{\partial\theta(l, t)}{\partial x} = 0; \tag{8.5}$$

then we may take a solution of equation (7.5) in the form

$$\bar\sigma(x, s) = A\sinh\kappa_1(l - x) + B\sinh\kappa_2(l - x) + C\cosh\kappa_1(l - x) + D\cosh\kappa_2(l - x),$$

where $\kappa_1$, $\kappa_2$ are defined by the equations (7.6). The corresponding
expression for the Laplace transform of the temperature variation is

$$abs^2\bar\theta(x, s) = (\kappa_1^2 - as^2)[A\sinh\kappa_1(l - x) + C\cosh\kappa_1(l - x)] + (\kappa_2^2 - as^2)[B\sinh\kappa_2(l - x) + D\cosh\kappa_2(l - x)].$$

If these forms are to satisfy the boundary conditions (8.5) we must choose
$A$, $B$, $C$, $D$ so that they satisfy the equations

$$A\sinh\kappa_1 l + B\sinh\kappa_2 l + C\cosh\kappa_1 l + D\cosh\kappa_2 l = \bar\Pi(s),$$

$$C + D = 0,$$

$$\kappa_1(\kappa_1^2 - as^2)(A\cosh\kappa_1 l + C\sinh\kappa_1 l) + \kappa_2(\kappa_2^2 - as^2)(B\cosh\kappa_2 l + D\sinh\kappa_2 l) = 0,$$

$$\kappa_1(\kappa_1^2 - as^2)A + \kappa_2(\kappa_2^2 - as^2)B = 0.$$

Solving these equations we find that

$$A = \kappa_2(\kappa_2^2 - as^2)\phi, \qquad B = -\kappa_1(\kappa_1^2 - as^2)\phi, \qquad C = -D = \psi,$$

where

$$\phi = [\kappa_1(\kappa_1^2 - as^2)\sinh\kappa_1 l - \kappa_2(\kappa_2^2 - as^2)\sinh\kappa_2 l]\,\frac{\bar\Pi(s)}{\Delta(s)},$$

$$\psi = -[\kappa_1\kappa_2(\kappa_1^2 - as^2)(\kappa_2^2 - as^2)(\cosh\kappa_1 l - \cosh\kappa_2 l)]\,\frac{\bar\Pi(s)}{\Delta(s)},$$

with

$$\Delta(s) = 2\kappa_1\kappa_2(\kappa_1^2 - as^2)(\kappa_2^2 - as^2)(\cosh\kappa_1 l\cosh\kappa_2 l - 1) - [\kappa_1^2(\kappa_1^2 - as^2)^2 + \kappa_2^2(\kappa_2^2 - as^2)^2]\sinh\kappa_1 l\sinh\kappa_2 l. \tag{8.6}$$

The relation of the function $\Delta(s)$ defined by equation (8.6) to the frequency
equation (8.4) is obvious.


9. Approximate Solutions 

In this section we shall consider a method of obtaining approximate 
solutions to the complete set of equations (3.1) to (3.3). 

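In some cases it is a simple matter to find a set of functions ($\sigma_0$, $u_0$, $\theta_0$)
satisfying the boundary conditions of the problem and the "unlinked"
equations

$$\frac{\partial\sigma_0}{\partial x} = a\,\frac{\partial^2 u_0}{\partial t^2}, \tag{9.1}$$

$$\sigma_0 = \frac{\partial u_0}{\partial x}, \tag{9.2}$$

$$\frac{\partial^2\theta_0}{\partial x^2} = f\,\frac{\partial\theta_0}{\partial t}. \tag{9.3}$$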



From these solutions we may determine an approximate solution of the
set (3.1) to (3.3). Let

00 

(«» <r» = 'r, W, (9-4) 

rs=0 

where ($u_0$, $\sigma_0$, $\theta_0$) satisfy equations (9.1) to (9.3); then, substituting in
equations (3.1) to (3.3) we find that ($u_r$, $\sigma_r$, $\theta_r$) satisfy the equations


U“Vr _ vJar 

( 9 - 5 ) 

/a 2 

\e.x2 rJ// r ' 0* ’ 

( 9 - 6 ) 

/A 

s~ w 9 « 

( 9 - 7 ) 


(r > 1), and the relevant physical quantities vanish on the boundary of the 
rod. 

As an example of this method, suppose that we wish to find the approximate
solution of the problem considered in case (i) of § 6. In this case
the approximate solutions of equations (9.1), (9.2) and (9.3) would be

$$\theta_0 = \Theta\exp\{-(1 + i)(\tfrac{1}{2}f\omega)^{1/2}x + i\omega t\}, \qquad u_0 = \sigma_0 = 0.$$

We therefore have 

which has solution 

Ui-A**-** - (j0Wa,)t(l 

au > 1 + if m 

Inserting this expression in equation (8.7) we find that 

afx, t)= ---— 

aco 2 +ifcn 

and the boundary conditions demand that 0^(0, #) =0 so that we have 

j 

aor 4 - tfco 

We therefore obtain the approximate solution 

u(x, t) =gu 1 (x, t) 

m »)*(x +09 

aaf + ifco auf+ifo) 


( 9 - 8 ) 



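For this solution we have the boundary value

$$u(0, t) = -\left(\frac{2}{f\omega}\right)^{1/2} b\Theta\,\frac{1 - i[1 + (2a\omega/f)^{1/2}]}{1 + [1 + (2a\omega/f)^{1/2}]^2}\,e^{i\omega t}. \tag{9.9}$$

Comparing this solution with the exact solution (6.11) we find that the ratio
$\chi(\omega)$ of the modulus of the approximate boundary value (9.9) to that of the
exact boundary value is

$$\chi(\omega) = \left(\frac{2}{f\omega}\right)^{1/2}\frac{|\mu_1 + \mu_2|}{[1 + \{1 + (2a\omega/f)^{1/2}\}^2]^{1/2}}. \tag{9.10}$$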


To test the validity of this approximation the values of $\chi(\omega)$ for
aluminium were examined over a wide range of frequencies. The results
of this numerical investigation are shown in Table VII. It will be observed
that over a very wide frequency range the function $\chi(\omega)$ does not differ from
unity by more than 0.0028, showing that the approximate solution obtained by
this crude method is accurate to within about one quarter of one per cent.

Table VII

ω/ω*    10^-10   10^-8    10^-6    10^-4    10^-2    1        10^2
χ(ω)    1.0028   1.0028   1.0028   1.0028   1.0024   1.0006   1.0000


References to Literature

Biot, M. A., 1956. "Thermoelasticity and Irreversible Thermodynamics",
J. Appl. Phys., 27, 240.

Brillouin, L., 1938. Tenseurs en Mécanique et en Élasticité. Paris: Masson
et Cie.

Carslaw, H. S., and Jaeger, J. C., 1947. The Conduction of Heat in Solids.
Oxford University Press.

Chadwick, P., and Sneddon, I. N., 1958. "Plane Waves in an Elastic Solid
Conducting Heat", J. Mech. Phys. Solids, 6, 223.

Deresiewicz, H., 1957. "Plane Waves in a Thermoelastic Solid", J. Acoust.
Soc. Amer., 29, 204.

Duhamel, J. M. C., 1837. "Second Mémoire sur les Phénomènes Thermo-Mécaniques",
J. Éc. Polyt. Paris, 15, 1.

Erdélyi, A. (Edit.), 1954. Tables of Integral Transforms. Vol. 1. New York:
McGraw-Hill Book Co.

Jeffreys, H., 1930. "The Thermodynamics of an Elastic Solid", Proc. Camb.
Phil. Soc., 26, 101.

Lessen, M., 1956. "Thermoelasticity and Thermal Shock", J. Mech. Phys.
Solids, 5, 57.

Lessen, M., 1957. "The Motion of a Thermoelastic Solid", Quart. Appl. Math., 15,
105.

Sokolnikoff, I. S., 1956. The Mathematical Theory of Elasticity. New York:
McGraw-Hill Book Co.

Sneddon, I. N., 1951. Fourier Transforms. New York: McGraw-Hill
Book Co.

Voigt, W., 1910. Lehrbuch der Kristallphysik. Berlin: Teubner-Verlag.

Weiner, J. H., 1957. "A Uniqueness Theorem for the Coupled Thermoelastic
Problem", Quart. Appl. Math., 15, 102.


(Issued separately February 19, 1959)





X.—The Dynamic Stresses Produced in Elastic Bodies by 
Uneven Heating. By G. Eason, King’s College, Newcastle 
upon Tyne, and I. N. Sneddon,* The University of Glasgow. 

(MS. received April 10, 1958. Read October 27, 1958) 


Synopsis 

The presence of a non-uniform distribution of temperature in an elastic solid gives
rise to an additional term in the generalized Hooke's Law connecting the stress and
strain tensors and to a term involving the time rate of change of the dilatation in the
equation governing the conduction of heat in the solid. The present paper is concerned
with the effects produced by these additional terms in two simple situations. In the
first, the elastic solid is regarded as being of infinite extent and the distribution of temperature
in the solid is produced by heat sources whose strength may vary with time. In the
second, the solid is supposed to be semi-infinite and to be deformed by prescribed variations
in the temperature of the bounding plane and by heat sources within itself.


CONTENTS

                                                                        PAGE

I. The Theory of Thermoelastic Disturbances                              144
    1. Introduction                                                      144
    2. The Thermoelastic Equations                                       145
    3. Dimensionless Form of the Equations                               148

II. The Stresses Produced in an Infinite Elastic Solid by Uneven
    Heating                                                              151
    4. The Solution of the Basic Equations                               151
    5. The Steady State Solution                                         153
    6. The Two-dimensional Problem                                       155
    7. Axially Symmetrical Problems                                      156
    8. The Quasi-static Solution                                         158
    9. Solutions of Special Problems                                     159
        (i) The Stress due to a Periodic Line Source                     159
        (ii) The Effect of a Moving Line Source                          163
        (iii) The Stress due to an Impulsive Line Source                 165
        (iv) Impulsive Point Source of Heat                              166

III. The Stresses Produced in a Semi-infinite Solid by Uneven Heating
    of the Surface and by Internal Sources of Heat                       167
    10. The Solution of the Basic Equations                              167
    11. The Steady State Solution                                        171
    12. The Solution of the Two-dimensional Problem                      172
    13. The Effect of a Periodic Line Temperature applied to the Surface 174

IV. References to Literature                                             175


* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 














I. The Theory of Thermoelastic Disturbances

1. Introduction

If there is a non-uniform distribution of temperature in an elastic solid 
then there are corresponding changes in volume elements throughout the 
solid. These changes in the strain distribution throughout the solid have 
two important consequences. First of all thermal strains are set up in the 
solid as a result of the expansions (or contractions) accompanying changes 
in temperature and these in turn affect the distribution of stress in the body. 
Furthermore some of the mechanical energy expended in producing local 
changes of volume in an elastic solid is converted into heat which is absorbed 
by the solid itself. Thermoelasticity is concerned with the analysis of the 
effect these interlinked phenomena have upon the distribution of stress and 
temperature in any given problem. 

The study of thermoelastic phenomena seems to have been begun over a
century ago by Duhamel (1837) but the first attempt at a rigorous derivation
of the basic equations came later with the work of Voigt (1910) and
Jeffreys (1930). Interest in the fundamentals of the subject has recently
been revived by the publication of papers by Biot (1956) and Lessen (1956,
1957). Solutions of these equations have recently been considered by
Deresiewicz (1957), Chadwick and Sneddon (1958), and Sneddon (1958).

The present paper is concerned with the derivation of general solutions
of the equations of thermoelasticity in two simple situations: the first
when the stresses are produced in an infinite solid by a given distribution
of heat sources (Part II) and the second when the stresses are produced in
a semi-infinite solid by imposing a prescribed distribution of temperature
on the plane boundary of the solid (Part III). Throughout it is assumed
that the displacements and strains of the solid are infinitesimal so that the
classical theory of elasticity is applicable and that the fluctuations in the
temperature within the solid are limited to a range within which the thermal
and elastic constants of the material are independent of the temperature.

The first sections (§§2, 3) give a brief account of the fundamental 
equations of thermoelasticity and of the systems of units which may be 
employed to cast them into a convenient dimensionless form. 

In § 4 we derive the general solution of the thermoelastic equations for
an infinite solid in which are distributed heat sources of prescribed strength;
the method of solution, based on the use of four-dimensional Fourier
transforms, is similar to that employed in a recent paper on the equations
of dynamical elasticity (Eason, Fulton and Sneddon 1956). From this
general solution we derive the solution appropriate to steady state problems
(§ 5) and two-dimensional problems (§ 6). In § 7 we use a mixed Fourier-Hankel
transform to derive the general solution of problems in which there
is axial symmetry. In many problems (the quasi-static problems) it is
possible to take one of the constants of the theory to be zero; the solutions
appropriate to this assumption are discussed in § 8. We conclude our
discussion of the infinite solid by considering (in § 9) the solutions of a
number of special problems. In one of these (§ 9 (i)) it is possible to derive
a formula by means of which we can estimate the closeness of the quasi-static
approximation.

Part III follows the same general pattern. In § 10 we derive the general
solution of the thermoelastic equations for a semi-infinite solid which is
deformed by the application to its plane boundary of a prescribed distribution
of temperature, and from it deduce the solutions appropriate to
steady-state problems (§ 11) and to two-dimensional problems (§ 12). The
paper ends with a brief discussion of a special problem, namely the determination
of the effect produced by applying a periodic line source of temperature to
the free surface.


2. The Thermoelastic Equations 


Under free thermal expansion an isotropic body experiences a strain
whose components $\gamma'_{ij}$ referred to a set of orthogonal cartesian axes
$O(x_1, x_2, x_3)$ are specified by the equation

$$\gamma'_{ij} = \alpha\theta\delta_{ij} \tag{2.1}$$

in which $\theta$ denotes the temperature changes from $T$, the temperature of
the solid in a state of zero stress and strain, and $\alpha$ denotes the coefficient
of linear expansion of the solid. It is assumed that $\theta$ is sufficiently small
for the thermal properties of the solid to remain constant throughout the
times in which we are interested. In terms of the components, $u_i$, of the
displacement vector, the total strain in the solid is given by the equation

$$\gamma_{ij} = \tfrac{1}{2}(u_{i,j} + u_{j,i}), \tag{2.2}$$

where $u_{i,j}$ denotes the partial derivative $\partial u_i/\partial x_j$. This total strain is made
up of the thermal strain and the elastic strain whose components $\gamma''_{ij}$ are
specified by the equation (Sokolnikoff 1956, p. 66)

$$\gamma''_{ij} = \frac{\tau_{ij}}{2\mu} - \frac{\lambda\Theta\delta_{ij}}{2\mu(3\lambda + 2\mu)}, \tag{2.3}$$

where $\tau_{ij}$ are the components of the stress tensor,

$$\Theta = \tau_{ii} \tag{2.4}$$

is the sum of the principal stresses, and $\lambda$ and $\mu$ are Lamé's elastic constants
for the body. Substituting from equations (2.1) to (2.3) into the equation

$$\gamma_{ij} = \gamma'_{ij} + \gamma''_{ij}$$

we obtain the tensor equation

$$\gamma_{ij} = \frac{\tau_{ij}}{2\mu} - \left[\frac{\lambda\Theta}{2\mu(3\lambda + 2\mu)} - \alpha\theta\right]\delta_{ij}. \tag{2.5}$$

Solving this tensor equation for the components of the stress tensor we
find that

$$\tau_{ij} = (\lambda\Delta - \gamma\theta)\delta_{ij} + 2\mu\gamma_{ij}, \tag{2.6}$$

where

$$\Delta = \gamma_{ii} = u_{i,i} \tag{2.7}$$

denotes the dilatation in the solid and

$$\gamma = \alpha(3\lambda + 2\mu). \tag{2.8}$$

The physical relationship expressed by the tensor equation (2.6) is called
the Duhamel-Neumann Law (Duhamel 1837; Neumann 1885).
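As a simple check on the Duhamel-Neumann Law, it may be noted that the stress must vanish when the strain is that of free thermal expansion, and must reduce to a hydrostatic pressure of amount $\gamma\theta$ when the strain is zero. The Python sketch below illustrates both statements; the function names are hypothetical and the numerical values of $\lambda$, $\mu$ and $\alpha$ are merely illustrative figures of the order appropriate to a light metal, not values taken from the paper.

```python
def duhamel_neumann_stress(strain, theta, lam, mu, alpha):
    """Stress tau_ij = (lam*Delta - gamma*theta) delta_ij + 2 mu gamma_ij."""
    gamma = alpha * (3.0 * lam + 2.0 * mu)                # thermoelastic constant
    dilatation = sum(strain[i][i] for i in range(3))      # trace of the strain tensor
    return [
        [(lam * dilatation - gamma * theta) * (1.0 if i == j else 0.0) + 2.0 * mu * strain[i][j]
         for j in range(3)]
        for i in range(3)
    ]


def free_expansion_strain(theta, alpha):
    """Strain alpha*theta*delta_ij of free thermal expansion."""
    return [[alpha * theta if i == j else 0.0 for j in range(3)] for i in range(3)]


# Illustrative (assumed) constants in c.g.s. units and a temperature rise of 10 degrees.
LAM, MU, ALPHA, THETA = 5.1e11, 2.6e11, 2.3e-5, 10.0
free = duhamel_neumann_stress(free_expansion_strain(THETA, ALPHA), THETA, LAM, MU, ALPHA)
clamped = duhamel_neumann_stress([[0.0] * 3 for _ in range(3)], THETA, LAM, MU, ALPHA)
```

Here `free` should vanish to rounding error, while `clamped` is diagonal with each diagonal element equal to $-\gamma\theta$.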

The thermodynamic variables describing the state of the elastic solid
are the strain components (2.2) and the absolute temperature $T + \theta$, and it
can be shown (Jeffreys 1930; Biot 1956) that the entropy $s$ per unit volume
of the solid is given by the equation

$$s = c\rho\log\left(1 + \frac{\theta}{T}\right) + \gamma\Delta, \tag{2.9}$$

where the additive constant, involved in the definition of the entropy, has
been chosen so that the entropy is zero in the reference state. In this
equation $\rho$ is the density of the solid, $c$ is the specific heat per unit mass at
constant strain (assumed independent of temperature in the vicinity of the
equilibrium temperature $T$), and $\gamma$ is defined by equation (2.8). If $\theta$ is
small in comparison with $T$ we find that equation (2.9) gives the simple
equation

$$s = \frac{\rho c\theta}{T} + \gamma\Delta \tag{2.10}$$

for the entropy per unit volume, so that the quantity of heat absorbed by
unit volume of the solid in the course of small deformations and small
variations in temperature is given by the formula

$$h = Ts = \rho c\theta + \gamma T\Delta. \tag{2.11}$$



Now it is known from the theory of the conduction of heat in solids (Carslaw
and Jaeger 1947, p. 6) that the variation of temperature within an isotropic
solid is governed by the equation

$$\frac{\partial h}{\partial t} = k\nabla^2\theta + q, \tag{2.12}$$

where $k$ is the heat conductivity of the solid, and $q$ is the quantity of heat
per unit volume generated in the solid. Substituting from equation (2.11)
into equation (2.12) we find that the temperature variation $\theta$ and the
dilatation $\Delta$ are linked through the equation

$$\rho c\,\frac{\partial\theta}{\partial t} + \gamma T\,\frac{\partial\Delta}{\partial t} = k\nabla^2\theta + q. \tag{2.13}$$

If we introduce the diffusivity

$$\kappa = \frac{k}{\rho c}$$

we can write this equation in the form

$$\frac{\partial\theta}{\partial t} = \kappa\nabla^2\theta + Q - \gamma'\,\frac{\partial\Delta}{\partial t}, \tag{2.14}$$

where $Q = q/(\rho c)$ and $\gamma' = \gamma T/(\rho c)$.

To complete the set of basic equations we have the equations of motion
in the form (Sokolnikoff 1956, p. 370)

$$\tau_{ij,j} + \rho F_i = \rho\ddot u_i, \tag{2.15}$$

where $(F_1, F_2, F_3)$ denotes the body force at the point $(x_1, x_2, x_3)$ and
$\ddot u_i$ denotes the $i$-th component, $\partial^2 u_i/\partial t^2$, of the acceleration of an infinitesimal
element centred at the same point. It is of course assumed that the
temperature changes involved are so small that the values of the elastic
constants remain unaltered throughout the solid.

The set of sixteen equations symbolized by equations (2.2), (2.6), (2.14)
and (2.15) is sufficient, when taken with the appropriate boundary conditions,
to determine the temperature variation and the components of
stress and displacement when the heat sources are prescribed. Weiner (1957)
has proved that solutions of these equations with $Q = 0$, $F_i = 0$ ($i$ = 1, 2, 3)
are unique when the initial values of $\theta$, $u_i$ and $\partial u_i/\partial t$ are all zero and $\theta$
and $u_i$ are specified on the boundary of the solid being considered, but his
proof can readily be extended to cover more general boundary conditions.





3. Dimensionless Form of the Equations 

It is convenient to write the basic set of thermoelastic equations in
dimensionless form (Sneddon and Berry 1958, p. 124). If we take a
typical length $l$ as our unit of length, a time $\tau$ as our unit of time, the
reference temperature $T$ as the unit of temperature, and the rigidity
modulus $\mu$ as unit of stress we find that the equations (2.15), (2.6) and
(2.14) respectively take the dimensionless forms



(3-i) 

$$\tau_{ij} = [(\beta^2 - 2)\Delta - b\theta]\delta_{ij} + 2\gamma_{ij}, \tag{3.2}$$

„„„ ^ ft M 

V 2 6+ 

8t 8t 

(3-3) 

where $\beta$ is the ratio $(2 + \lambda/\mu)^{1/2}$,


% Jt F 0-^ 

(3-4) 

define new source functions and 


$$a = \left(\frac{l}{v_s\tau}\right)^2, \quad b = \frac{\gamma T}{\mu}, \quad f = \frac{l^2}{\kappa\tau}, \quad g = \frac{\gamma l^2}{k\tau}, \tag{3.5}$$

$v_s$ being the velocity of shear waves in the solid, i.e. $(\mu/\rho)^{1/2}$ (Bullen 1947, p. 21).
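The definitions in (3.5) can be sketched as a small routine. This is only an illustration under stated assumptions: the function name `dimensionless_constants` and its argument names are hypothetical, all quantities are taken to be in c.g.s. units, and the form $g = \gamma l^2/(k\tau)$ is the reading of the scanned equation adopted here.

```python
import math

def dimensionless_constants(l, tau, T, mu, rho, gamma, k, c):
    """Return (a, b, f, g) as defined in equations (3.5).

    l, tau : chosen units of length and time
    T      : reference temperature
    mu     : rigidity modulus, rho : density
    gamma  : thermoelastic coupling constant
    k      : thermal conductivity, c : specific heat
    """
    v_s = math.sqrt(mu / rho)      # shear-wave velocity (mu/rho)^(1/2)
    kappa = k / (rho * c)          # diffusivity kappa = k/(rho c)
    a = (l / (v_s * tau)) ** 2
    b = gamma * T / mu
    f = l ** 2 / (kappa * tau)
    g = gamma * l ** 2 / (k * tau)
    return a, b, f, g
```

Whatever material values are supplied, $b$ and the ratio $g/f = \gamma/(\rho c)$ should come out the same for every choice of $l$ and $\tau$, which gives a quick check on any tabulated set of constants.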


In certain problems in which there is axial symmetry it is desirable to employ cylindrical polar co-ordinates $r$, $z$ and $\phi$ instead of cartesian co-ordinates. In the case of axial symmetry we may take the $z$-axis to coincide with the axis of symmetry; the displacement vector then has physical components $u$ and $w$ in the $r$ and $z$ directions respectively and zero component normal to these directions. There are four non-vanishing physical components of stress which may be denoted, in the usual von Kármán notation, by $\sigma_r$, $\sigma_\theta$, $\sigma_z$, $\tau_{rz}$ and which are related to $u$ and $w$ by the dimensionless equations

$$(\sigma_r, \sigma_\theta, \sigma_z) = [(\beta^2 - 2)\Delta - b\theta](1, 1, 1) + 2\left(\frac{\partial u}{\partial r}, \frac{u}{r}, \frac{\partial w}{\partial z}\right) \tag{3.6}$$

and

$$\tau_{rz} = \frac{\partial u}{\partial z} + \frac{\partial w}{\partial r}, \tag{3.7}$$

where the dilatation $\Delta$ takes the form

$$\Delta = \frac{\partial u}{\partial r} + \frac{u}{r} + \frac{\partial w}{\partial z} \tag{3.8}$$



Produced in Elastic Bodies by Uneven Heating 


149 


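in this system of co-ordinates. When transformed to cylindrical co-ordinates the equations of motion (3.1) assume, in the symmetrical case, the forms

$$\frac{\partial\sigma_r}{\partial r} + \frac{\partial\tau_{rz}}{\partial z} + \frac{\sigma_r - \sigma_\theta}{r} + F_r = a\ddot{u}, \tag{3.9}$$

$$\frac{\partial\tau_{rz}}{\partial r} + \frac{\partial\sigma_z}{\partial z} + \frac{\tau_{rz}}{r} + F_z = a\ddot{w} \tag{3.10}$$

and the diffusion equation (3.3) may be written as

$$\frac{\partial^2\theta}{\partial r^2} + \frac{1}{r}\frac{\partial\theta}{\partial r} + \frac{\partial^2\theta}{\partial z^2} + \Theta = f\frac{\partial\theta}{\partial t} + g\frac{\partial\Delta}{\partial t}. \tag{3.11}$$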


The values of the constants $a$, $b$, $f$ and $g$ occurring in these equations will depend upon the choice of the basic units of time and length. In some problems it may be convenient to choose $l$ to be 1 cm. and $\tau$ to be 1 sec. The values of $a$, $b$, $f$ and $g$ for four common metals and this choice of $l$ and $\tau$ are shown in Table I; here we have taken $T$ to be 293°K.

Table I

Values of the constants a, b, f and g for four common metals with l = 1 cm., τ = 1 sec. and T = 293°K.

        Aluminium        Copper           Iron             Lead
a       1.034 × 10⁻¹¹    2.166 × 10⁻¹¹    1.532 × 10⁻¹¹    2.034 × 10⁻¹⁰
b       0.0630           0.0417           0.0089           0.2320
f       1.168            0.899            5.208            4.152
g       2.687            1.497            8.035            12.25
ε       3.56 × 10⁻²      1.68 × 10⁻²      2.97 × 10⁻⁴      7.33 × 10⁻²

For other choices of $l$ and $\tau$ the corresponding values of $a$, $b$, $f$ and $g$ can be derived from equations (3.5). It will be observed in these equations that $b$ is independent of the choice of $l$ and $\tau$ as also are $g/f$ and

$$\epsilon = \frac{bg}{\beta^2 f}. \tag{3.12}$$


For theoretical investigations at high frequencies (Chadwick and Sneddon 1958) it is desirable to choose

$$\tau = \frac{1}{\omega^*} = \frac{\kappa}{v_P^2} \tag{3.13}$$

as the unit of time, where

$$v_P = \left(\frac{\lambda + 2\mu}{\rho}\right)^{1/2} \tag{3.14}$$

is the velocity of pure $P$-waves in the solid (Bullen 1947, p. 74). The unit of length can then be taken to be

$$l = \frac{\kappa}{v_P}. \tag{3.15}$$

With this choice of fundamental units

$$a = \beta^2, \quad f = 1, \quad g = \frac{\gamma}{\rho c}, \tag{3.16}$$

where $\beta = v_P/v_s$, the ratio of the velocity of pure $P$-waves to that of $S$-waves in the solid. Thus

$$\beta^2 = \frac{2(1 - \nu)}{1 - 2\nu}, \tag{3.17}$$

where $\nu$ denotes Poisson's ratio of the solid. The values of $b$ are the same as those given in Table I and the value of $g$ is merely that of $g/f$ obtained from Table I. The values of $\epsilon$ are unaltered. As we have said, this system of units is of value in some investigations but it should be borne in mind that $\omega^*$ is, for most materials, much higher than the highest frequency obtainable by using ultrasonic techniques and only about one-hundredth of the cut-off frequency $\omega_c$ of the Debye spectrum (Brillouin 1938, p. 324). For instance, for iron, $\omega^* = 1.75 \times 10^{12}$ sec.$^{-1}$ while $\omega_c = 9.95 \times 10^{13}$ sec.$^{-1}$; this means that the unit of length is $3.31 \times 10^{-7}$ cm. so that the system is not of much interest in the discussion of engineering problems except possibly for very high frequency phenomena. For the values of these constants for aluminium, copper and lead the reader is referred to Table I of Chadwick and Sneddon (1958).
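As a rough numerical illustration of this high-frequency system of units, the sketch below assumes that the unit of time is $\kappa/v_P^2$ (so that $\omega^* = v_P^2/\kappa$) and the unit of length is $\kappa/v_P$. The function names are hypothetical, and the diffusivity and wave speed used in the usage note are illustrative figures chosen to be of the right order for iron, not data taken from the paper.

```python
def high_frequency_units(kappa, v_p):
    """Return (omega_star, unit_length) for diffusivity kappa (cm^2/sec)
    and P-wave velocity v_p (cm/sec), assuming tau = kappa / v_p**2."""
    omega_star = v_p ** 2 / kappa   # characteristic frequency, sec^-1
    unit_length = kappa / v_p       # characteristic length, cm
    return omega_star, unit_length

def poisson_ratio_from_beta2(beta2):
    """Invert beta^2 = 2(1 - nu)/(1 - 2 nu), equation (3.17), for nu."""
    return (beta2 - 2.0) / (2.0 * (beta2 - 1.0))
```

With the assumed values `kappa = 0.192` and `v_p = 5.8e5`, `high_frequency_units` returns a frequency close to $1.75 \times 10^{12}$ sec.$^{-1}$ and a length close to $3.31 \times 10^{-7}$ cm., the orders of magnitude quoted above for iron.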

In practical problems still another system of units may be employed. For instance, if we are considering the propagation of thermal stress in a plate of thickness 1 metre, it is desirable to take $l = 10^2$ cm. and we may choose

$$\tau = (10^2/v_s) \text{ sec.}, \tag{3.18}$$

where the velocity of shear waves, $v_s$, is expressed in cm. per sec. With this choice of units we may write the equations of thermoelasticity in the forms

$$\tau_{ij,j} + F_i = \ddot{u}_i, \tag{3.19}$$

$$\tau_{ij} = [(\beta^2 - 2)\Delta - b\theta]\delta_{ij} + 2\gamma_{ij}, \tag{3.20}$$

$$\chi(\nabla^2\theta + \Theta) = \frac{\partial\theta}{\partial t} + \frac{g}{f}\frac{\partial\Delta}{\partial t}, \tag{3.21}$$

where

$$\chi = f^{-1} = \kappa v_s^{-1} \times 10^{-2}, \quad \frac{g}{f} = \frac{\gamma}{\rho c}, \quad b = \frac{\gamma T}{\mu}, \tag{3.22}$$

all the physical quantities being measured in c.g.s. units. For the metals considered in Table I we get the values of $\chi$ and $\tau$ shown in Table II; in this system of units $b$ has the same set of values as in Table I and the values of $g/f$ can be obtained by dividing the fourth row of Table I by the third row. In calculating $\tau$ and $\chi$ in Table II we have again assumed that $T = 293$°K.

Table II

            Aluminium        Copper           Iron             Lead
τ (secs.)   3.215 × 10⁻⁴     4.654 × 10⁻⁴     3.072 × 10⁻⁴     1.427 × 10⁻³
χ           2.750 × 10⁻⁸     5.172 × 10⁻⁸     5.887 × 10⁻⁹     3.429 × 10⁻⁸
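The entries of Table II can be recovered from those of Table I, since with $l = 1$ cm. and $\tau = 1$ sec. one has $a = 1/v_s^2$ and $f = 1/\kappa$. The sketch below is illustrative only: the function name `plate_units` and the dictionary are hypothetical, and the numerical entries (including their exponents) are as read from the scanned tables for three of the metals.

```python
import math

# (a, f) for l = 1 cm., tau = 1 sec., as read from Table I
TABLE_I = {
    "Aluminium": (1.034e-11, 1.168),
    "Copper": (2.166e-11, 0.899),
    "Lead": (2.034e-10, 4.152),
}

def plate_units(a_cgs, f_cgs, l_cm=100.0):
    """Return (tau, chi) for the plate units of (3.18) and (3.22)."""
    v_s = 1.0 / math.sqrt(a_cgs)      # a = 1/v_s^2 when l = 1 cm., tau = 1 sec.
    kappa = 1.0 / f_cgs               # f = 1/kappa in the same units
    tau = l_cm / v_s                  # tau = 10^2 / v_s seconds
    chi = kappa * tau / l_cm ** 2     # chi = 1/f = kappa tau / l^2
    return tau, chi
```

For aluminium, `plate_units(*TABLE_I["Aluminium"])` gives approximately $3.216 \times 10^{-4}$ sec. and $2.753 \times 10^{-8}$, in agreement with the tabulated values to the accuracy of the printed figures.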


II. The Stresses Produced in an Infinite Elastic Solid 
by Uneven Heating 

4. The Solution of the Basic Equations 

We shall begin by considering the distribution of stress in an infinite elastic body containing heat sources, i.e. we shall consider the solution of equations (3.1), (3.2) and (3.3) for $-\infty < (x_1, x_2, x_3) < \infty$ and known functions $\Theta$. It will be assumed that the solid is free from body forces so that $F_i = 0$ $(i = 1, 2, 3)$ and that the temperature and all the components of stress and displacement vanish as $x_1^2 + x_2^2 + x_3^2 \to \infty$ or as $t \to \infty$.

To solve this set of partial differential equations we introduce the four-dimensional Fourier transform of each of the physical quantities occurring in the basic equations. The Fourier transform of a function $f(x_1, x_2, x_3, t)$ is defined by the equation

$$\bar{f}(\xi_1, \xi_2, \xi_3, \omega) = \frac{1}{4\pi^2}\int_{E_4} f(x_1, x_2, x_3, t)\exp[i(\xi_p x_p + \omega t)]\,dx, \tag{4.1}$$

where $dx = dx_1\,dx_2\,dx_3\,dt$ and $E_4$ denotes the entire $x_1x_2x_3t$-space. If we multiply both sides of equations (3.1), (3.2) and (3.3) by $\exp[i(\xi_p x_p + \omega t)]$ and integrate over $E_4$ then, making use of the results

$$\overline{\frac{\partial f}{\partial x_p}} = -i\xi_p\bar{f}, \qquad \overline{\frac{\partial f}{\partial t}} = -i\omega\bar{f} \tag{4.2}$$

(Sneddon 1951, p. 27) we find that these partial differential equations are equivalent to the set of algebraic equations

$$-i\xi_q\bar{\tau}_{pq} = -a\omega^2\bar{u}_p, \tag{4.3}$$

$$\bar{\tau}_{pq} = -[(\beta^2 - 2)i\xi_r\bar{u}_r + b\bar{\theta}]\delta_{pq} - i(\xi_p\bar{u}_q + \xi_q\bar{u}_p), \tag{4.4}$$

$$-\xi^2\bar{\theta} + \bar{\Theta} = -if\omega\bar{\theta} - g\omega\xi_r\bar{u}_r, \tag{4.5}$$

where $\xi^2 = \xi_p\xi_p$, from which we may obtain expressions for the Fourier transforms of the temperature and of the components of the stress tensor and the displacement vector in terms of the Fourier transform of the source function $\Theta$.
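The rule that differentiation becomes multiplication by $-i\xi$ under a transform with kernel $\exp(i\xi x)$ can be checked numerically in one dimension. The sketch below is illustrative only: it uses the symmetric normalisation $(2\pi)^{-1/2}$, a Gaussian test function chosen for convenience, and hypothetical helper names.

```python
import cmath
import math

def fourier_1d(func, xi, half_width=12.0, n=4801):
    """(2*pi)^(-1/2) * integral of func(x) * exp(i*xi*x) dx over
    [-half_width, half_width], evaluated by the trapezoidal rule."""
    h = 2.0 * half_width / (n - 1)
    total = 0j
    for j in range(n):
        x = -half_width + j * h
        weight = 0.5 if j in (0, n - 1) else 1.0
        total += weight * func(x) * cmath.exp(1j * xi * x)
    return total * h / math.sqrt(2.0 * math.pi)

def gauss(x):
    """Test function exp(-x^2/2), which decays rapidly at infinity."""
    return math.exp(-0.5 * x * x)

def dgauss(x):
    """Derivative of the test function."""
    return -x * math.exp(-0.5 * x * x)
```

For any real `xi`, `fourier_1d(dgauss, xi)` should agree with `-1j * xi * fourier_1d(gauss, xi)` to within quadrature error, which is the one-dimensional analogue of the results used to pass from the differential equations to the algebraic ones.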

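From equation (4.5) we find that

$$\bar{\theta} = \frac{\bar{\Theta} + g\omega\xi_r\bar{u}_r}{\xi^2 - if\omega}. \tag{4.6}$$

If we substitute from this equation into equation (4.4) and insert the resulting expression for $\bar{\tau}_{pq}$ into equation (4.3) we obtain the equation

$$(\xi^2 - a\omega^2)\bar{u}_p + \xi_p(\beta^2 - 1)\xi_r\bar{u}_r = \frac{ib\xi_p(\bar{\Theta} + g\omega\xi_r\bar{u}_r)}{\xi^2 - if\omega}$$

which may be solved to give

$$\bar{u}_p = \frac{ib\xi_p\bar{\Theta}}{(\xi^2 - if\omega)(\beta^2\xi^2 - a\omega^2) - ibg\omega\xi^2}. \tag{4.7}$$

Substituting this expression back in equation (4.6) we find that $\bar{\theta}$ is given by the equation

$$\bar{\theta} = \frac{(\beta^2\xi^2 - a\omega^2)\bar{\Theta}}{(\xi^2 - if\omega)(\beta^2\xi^2 - a\omega^2) - ibg\omega\xi^2}, \tag{4.8}$$

which may be written in the form

$$\bar{\theta} = \bar{\theta}_c + \frac{ibg\omega\xi^2\bar{\Theta}}{(\xi^2 - if\omega)[(\xi^2 - if\omega)(\beta^2\xi^2 - a\omega^2) - ibg\omega\xi^2]}, \tag{4.9}$$

where

$$\bar{\theta}_c = \frac{\bar{\Theta}}{\xi^2 - if\omega}$$

denotes the Fourier transform of the "classical" solution. Substituting from equations (4.7) and (4.8) into (4.4) we obtain the expression

$$\bar{\tau}_{pq} + b\bar{\theta}_c\delta_{pq} = \frac{b[(\beta^2 - 2)\xi^2\delta_{pq} + 2\xi_p\xi_q - i\omega bg\xi^2(\xi^2 - if\omega)^{-1}\delta_{pq}]\bar{\Theta}}{(\xi^2 - if\omega)(\beta^2\xi^2 - a\omega^2) - ibg\omega\xi^2} \tag{4.10}$$

for the Fourier transforms of the components of the stress tensor. It is also easily shown from equation (4.7) that the Fourier transform of the dilatation is given by the equation

$$\bar{\Delta} = \frac{b\xi^2\bar{\Theta}}{(\xi^2 - if\omega)(\beta^2\xi^2 - a\omega^2) - ibg\omega\xi^2}. \tag{4.11}$$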
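The algebra of this section can be checked numerically by substituting the transformed displacement and temperature back into the transformed equations of motion and heat conduction. The sketch below is an illustration only, assuming the transformed solution has the form $\bar{u}_p = ib\xi_p\bar{\Theta}/D$ and $\bar{\theta} = (\beta^2\xi^2 - a\omega^2)\bar{\Theta}/D$ with $D = (\xi^2 - if\omega)(\beta^2\xi^2 - a\omega^2) - ibg\omega\xi^2$; the function names and the sample values in the usage note are hypothetical.

```python
def transform_solution(xi, omega, Theta, a, b, f, g, beta2):
    """Transformed displacement components and temperature for wave
    vector xi = (xi1, xi2, xi3), frequency omega and source transform Theta."""
    xi2 = sum(x * x for x in xi)
    D = (xi2 - 1j * f * omega) * (beta2 * xi2 - a * omega ** 2) \
        - 1j * b * g * omega * xi2
    u = [1j * b * x * Theta / D for x in xi]
    theta = (beta2 * xi2 - a * omega ** 2) * Theta / D
    return u, theta

def residuals(xi, omega, Theta, a, b, f, g, beta2):
    """Residuals of the transformed momentum and heat equations."""
    u, theta = transform_solution(xi, omega, Theta, a, b, f, g, beta2)
    xi2 = sum(x * x for x in xi)
    xi_dot_u = sum(x * v for x, v in zip(xi, u))
    # (xi^2 - a w^2) u_p + (beta^2 - 1) xi_p (xi.u) - i b xi_p theta = 0
    momentum = [(xi2 - a * omega ** 2) * u[p]
                + (beta2 - 1.0) * xi[p] * xi_dot_u
                - 1j * b * xi[p] * theta for p in range(3)]
    # (xi^2 - i f w) theta - Theta - g w (xi.u) = 0
    heat = (xi2 - 1j * f * omega) * theta - Theta - g * omega * xi_dot_u
    return momentum, heat
```

Calling `residuals((0.7, -1.1, 0.4), 0.9, 1 + 0.5j, 0.3, 0.06, 1.2, 2.7, 4.0)` should return values that vanish to rounding error, which confirms that the quoted transforms satisfy the coupled algebraic system.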

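Once the Fourier transform of a quantity has been determined the quantity itself may be calculated by means of Fourier's theorem

$$f(x_1, x_2, x_3, t) = \frac{1}{4\pi^2}\int_{W_4}\bar{f}(\xi_1, \xi_2, \xi_3, \omega)\exp[-i(\xi_p x_p + \omega t)]\,dW \tag{4.12}$$

(Sneddon 1951, p. 45), where $dW = d\xi_1\,d\xi_2\,d\xi_3\,d\omega$ and $W_4$ is the entire $\xi_1\xi_2\xi_3\omega$-space. In this way we obtain the equation

$$u_p = \frac{b}{4\pi^2}\int_{W_4}\frac{i\xi_p\bar{\Theta}\exp[-i(\xi_q x_q + \omega t)]\,dW}{(\xi^2 - if\omega)(\beta^2\xi^2 - a\omega^2) - ibg\omega\xi^2}, \tag{4.13}$$

by means of which we may calculate the components of the displacement vector, and the equation

$$\tau_{pq} + b\theta_c\delta_{pq} = \frac{b}{4\pi^2}\int_{W_4}\frac{[(\beta^2 - 2)\xi^2\delta_{pq} + 2\xi_p\xi_q - i\omega bg\xi^2(\xi^2 - if\omega)^{-1}\delta_{pq}]\bar{\Theta}\exp[-i(\xi_r x_r + \omega t)]}{(\xi^2 - if\omega)(\beta^2\xi^2 - a\omega^2) - ibg\omega\xi^2}\,dW \tag{4.14}$$

which determines the components of the stress tensor. The temperature within the solid is given by the equation

$$\theta = \theta_c + \frac{bg}{4\pi^2}\int_{W_4}\frac{i\omega\xi^2\bar{\Theta}\exp[-i(\xi_p x_p + \omega t)]\,dW}{[(\xi^2 - if\omega)(\beta^2\xi^2 - a\omega^2) - ibg\omega\xi^2](\xi^2 - if\omega)}, \tag{4.15}$$

where the "classical" temperature $\theta_c$ has the form

$$\theta_c = \frac{1}{4\pi^2}\int_{W_4}\frac{\bar{\Theta}}{\xi^2 - if\omega}\exp[-i(\xi_p x_p + \omega t)]\,dW. \tag{4.16}$$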


5. The Steady State Solution 

We shall consider the case in which the heat in the solid is generated at a constant rate so that the basic equations are each independent of the time variable $t$. The equations of equilibrium can be solved by the method of the previous section but it is a simple matter to derive the equilibrium solutions from the general equations of the previous section. When the source function $\Theta$ is independent of the time we may write

$$\Theta(x_1, x_2, x_3, t) = \vartheta(x_1, x_2, x_3), \tag{5.1}$$

so that

$$\bar{\Theta}(\xi_1, \xi_2, \xi_3, \omega) = (2\pi)^{1/2}\bar{\vartheta}(\xi_1, \xi_2, \xi_3)\delta(\omega), \tag{5.2}$$

where $\delta(\omega)$ is the Dirac delta function of argument $\omega$ and $\bar{\vartheta}$ is the three-dimensional Fourier transform of $\vartheta$ defined by the equation

$$\bar{\vartheta} = \frac{1}{(2\pi)^{3/2}}\int_{E_3}\vartheta\exp(i\xi_p x_p)\,dV \tag{5.3}$$

in which $dV = dx_1\,dx_2\,dx_3$ and $E_3$ denotes the whole $x_1x_2x_3$-space. Substituting from equation (5.2) into equation (4.13) we find that, in the case of steady heat flow, the components of the displacement vector are given by the equation

$$u_p = \frac{b}{(2\pi)^{3/2}\beta^2}\int_{W_3}\frac{i\xi_p\bar{\vartheta}\exp(-i\xi_q x_q)}{\xi^4}\,d\xi, \tag{5.4}$$

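where $d\xi = d\xi_1\,d\xi_2\,d\xi_3$ and $W_3$ is the entire $\xi_1\xi_2\xi_3$-space. Similarly from equation (4.14) we obtain the equation

$$\tau_{pq} = \frac{2b}{(2\pi)^{3/2}\beta^2}\int_{W_3}\frac{(\xi_p\xi_q - \xi^2\delta_{pq})\bar{\vartheta}\exp(-i\xi_r x_r)}{\xi^4}\,d\xi. \tag{5.5}$$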

As an example of the use of these equations, we shall consider the case in which the only source of heat in the solid is a point source at the origin which is generating heat at a constant rate. In these circumstances we may write

$$\vartheta = \vartheta_0\delta(x_1)\delta(x_2)\delta(x_3),$$

where $\vartheta_0$ is a constant. It follows immediately from the definition of the Dirac delta function that

$$\bar{\vartheta} = \frac{\vartheta_0}{(2\pi)^{3/2}}. \tag{5.6}$$

Substituting from equation (5.6) into equation (5.4) we find that

$$u_p = -\frac{b\vartheta_0}{8\pi^3\beta^2}\frac{\partial I}{\partial x_p},$$

where

$$I = \int_{W_3}\frac{\exp(-i\xi_q x_q)}{\xi^4}\,d\xi.$$

It is a simple matter to show that

$$u_p = \frac{b\vartheta_0}{8\pi\beta^2}\left(\frac{x_p}{r}\right), \tag{5.7}$$

where $r^2 = x_1^2 + x_2^2 + x_3^2$. Expressions for the stress components may now be obtained by differentiating this equation and so a full description can be given of the distribution of stress in the solid.

The solution of the general problem may be obtained from equation (5.7) by the principle of superposition. We find that

$$u_p = \frac{b}{8\pi\beta^2}\int_{E_3'}\frac{(x_p - x_p')\vartheta(x_1', x_2', x_3')}{R}\,dV', \tag{5.8}$$

where $dV' = dx_1'\,dx_2'\,dx_3'$ and $E_3'$ denotes the whole $x_1'x_2'x_3'$-space. The distance $R$ is defined by the equation

$$R^2 = (x_1 - x_1')^2 + (x_2 - x_2')^2 + (x_3 - x_3')^2. \tag{5.9}$$

The result (5.8) could also have been obtained from equation (5.4) by means of the convolution theorem for three-dimensional Fourier transforms.
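A discrete analogue of the superposition integral may help to fix ideas. The sketch below assumes that a steady point source of strength $\vartheta_0$ at the origin produces the radial displacement $u_p = b\vartheta_0 x_p/(8\pi\beta^2 r)$, and simply sums such contributions over a finite list of sources; the function name and the sample values are hypothetical.

```python
import math

def displacement_from_sources(x, sources, b, beta2):
    """Steady-state displacement at the point x = (x1, x2, x3) due to a
    list of point sources, each given as ((x1', x2', x3'), strength)."""
    coeff = b / (8.0 * math.pi * beta2)
    u = [0.0, 0.0, 0.0]
    for position, strength in sources:
        d = [xc - pc for xc, pc in zip(x, position)]
        R = math.sqrt(sum(c * c for c in d))   # distance from source to x
        for p in range(3):
            u[p] += coeff * strength * d[p] / R
    return u
```

For a single source the magnitude of the result is independent of the distance from the source, as the point-source solution requires: a steady point source in an infinite solid produces a radial displacement of constant size.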


6. The Two-dimensional Problem 

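The solution appropriate to a two-dimensional problem of plane strain in which the displacement vector at the point $(x_1, x_2)$ has components $(u_1, u_2)$ and the state of stress is uniquely determined by the three components $\tau_{\gamma\gamma'}$ $(\gamma, \gamma' = 1, 2)$, may be obtained from equations (4.13) and (4.14) by assuming that $\Theta$ is a function of $x_1$, $x_2$ and $t$ only. We then find that the $\bar{\Theta}$ occurring in these equations should be replaced by the expression

$$(2\pi)^{1/2}\delta(\xi_3)\bar{\Theta}(\xi_1, \xi_2, \omega), \tag{6.1}$$

where $\bar{\Theta}(\xi_1, \xi_2, \omega)$ is now defined to be the three-dimensional Fourier transform of the source function $\Theta(x_1, x_2, t)$, so that

$$\bar{\Theta}(\xi_1, \xi_2, \omega) = \frac{1}{(2\pi)^{3/2}}\int_{S_3}\Theta(x_1, x_2, t)\exp[i(x_1\xi_1 + x_2\xi_2 + \omega t)]\,ds, \tag{6.2}$$

where $ds = dx_1\,dx_2\,dt$ and $S_3$ is the entire $x_1x_2t$-space. Inserting the expression (6.1) into equation (4.13) we obtain the equation

$$u_\gamma = \frac{b}{(2\pi)^{3/2}}\int_{T_3}\frac{i\xi_\gamma\bar{\Theta}(\xi_1, \xi_2, \omega)\exp[-i(x_1\xi_1 + x_2\xi_2 + \omega t)]\,dT}{(\xi_1^2 + \xi_2^2 - if\omega)[\beta^2(\xi_1^2 + \xi_2^2) - a\omega^2] - ibg\omega(\xi_1^2 + \xi_2^2)}, \tag{6.3}$$

where $\gamma = 1, 2$, $dT = d\xi_1\,d\xi_2\,d\omega$ and $T_3$ is the entire $\xi_1\xi_2\omega$-space.

In a similar way we can show that the components of stress are given by the tensor equation

$$\tau_{\gamma\gamma'} = \frac{b}{(2\pi)^{3/2}}\int_{T_3}\frac{[(\beta^2 - 2)(\xi_1^2 + \xi_2^2)\delta_{\gamma\gamma'} + 2\xi_\gamma\xi_{\gamma'} - i\omega bg(\xi_1^2 + \xi_2^2)(\xi_1^2 + \xi_2^2 - if\omega)^{-1}\delta_{\gamma\gamma'}]\bar{\Theta}e^{-i\chi}\,dT}{(\xi_1^2 + \xi_2^2 - if\omega)[\beta^2(\xi_1^2 + \xi_2^2) - a\omega^2] - i\omega bg(\xi_1^2 + \xi_2^2)} - \frac{b\delta_{\gamma\gamma'}}{(2\pi)^{3/2}}\int_{T_3}\frac{\bar{\Theta}e^{-i\chi}\,dT}{\xi_1^2 + \xi_2^2 - if\omega}, \tag{6.4}$$

where $\gamma, \gamma' = 1, 2$, and $\chi = x_1\xi_1 + x_2\xi_2 + \omega t$.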





7. Axially Symmetrical Problems 

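If we choose the axis of symmetry of the problems to be the $z$-axis then we need only consider the equations (3.6) to (3.11). Substituting from equations (3.6), (3.7) and (3.8) into equations (3.9) and (3.10) we find that, in the absence of body forces

$$\beta^2\left(\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} - \frac{u}{r^2}\right) + \frac{\partial^2 u}{\partial z^2} + (\beta^2 - 1)\frac{\partial^2 w}{\partial r\,\partial z} - b\frac{\partial\theta}{\partial r} = a\frac{\partial^2 u}{\partial t^2}, \tag{7.1}$$

$$\frac{\partial^2 w}{\partial r^2} + \frac{1}{r}\frac{\partial w}{\partial r} + \beta^2\frac{\partial^2 w}{\partial z^2} + (\beta^2 - 1)\frac{\partial}{\partial z}\left(\frac{\partial u}{\partial r} + \frac{u}{r}\right) - b\frac{\partial\theta}{\partial z} = a\frac{\partial^2 w}{\partial t^2}. \tag{7.2}$$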


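To solve these equations we introduce the transforms

$$\bar{u} = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\{i(\zeta z + \omega t)\}\,dz\,dt\int_0^\infty rJ_1(\xi r)u\,dr, \tag{7.3}$$

$$(\bar{w}, \bar{\theta}) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\{i(\zeta z + \omega t)\}\,dz\,dt\int_0^\infty rJ_0(\xi r)(w, \theta)\,dr. \tag{7.4}$$

If we multiply both sides of equation (7.1) by $(2\pi)^{-1}\exp\{i(\zeta z + \omega t)\}rJ_1(\xi r)$ and both sides of equation (7.2) by $(2\pi)^{-1}\exp\{i(\zeta z + \omega t)\}rJ_0(\xi r)$ and, in both cases, integrate over the whole $rzt$-space then, making use of the results

$$\int_0^\infty r\left(\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} - \frac{u}{r^2}\right)J_1(\xi r)\,dr = -\xi^2\int_0^\infty ruJ_1(\xi r)\,dr, \qquad \int_0^\infty r\left(\frac{\partial^2 w}{\partial r^2} + \frac{1}{r}\frac{\partial w}{\partial r}\right)J_0(\xi r)\,dr = -\xi^2\int_0^\infty rwJ_0(\xi r)\,dr$$

(Sneddon 1951, p. 61), we find that this pair of partial differential equations is equivalent to the pair of algebraic equations

$$(\beta^2\xi^2 + \zeta^2 - a\omega^2)\bar{u} - i(\beta^2 - 1)\xi\zeta\bar{w} = b\xi\bar{\theta},$$

$$i(\beta^2 - 1)\xi\zeta\bar{u} + (\xi^2 + \beta^2\zeta^2 - a\omega^2)\bar{w} = ib\zeta\bar{\theta}.$$

Solving these equations we find that

$$(\bar{u}, \bar{w}) = \frac{b\bar{\theta}(\xi, i\zeta)}{\beta^2(\xi^2 + \zeta^2) - a\omega^2}. \tag{7.5}$$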





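If we now multiply both sides of equation (3.11) by $(2\pi)^{-1}\exp[i(\zeta z + \omega t)]rJ_0(\xi r)$ and integrate over the whole $rzt$-space, we find that

$$(\xi^2 + \zeta^2 - if\omega)\bar{\theta} - i\omega g(\xi\bar{u} - i\zeta\bar{w}) = \bar{\Theta}, \tag{7.6}$$

where

$$\bar{\Theta} = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp[i(\zeta z + \omega t)]\,dz\,dt\int_0^\infty r\Theta J_0(\xi r)\,dr. \tag{7.7}$$

Solving the algebraic equations (7.5) and (7.6) for the unknowns $\bar{u}$, $\bar{w}$, $\bar{\theta}$ in terms of the known quantity $\bar{\Theta}$ we find the expressions

$$\frac{\bar{u}}{\xi} = \frac{\bar{w}}{i\zeta} = \frac{b\bar{\theta}}{\beta^2(\xi^2 + \zeta^2) - a\omega^2} = \frac{b\bar{\Theta}}{[\beta^2(\xi^2 + \zeta^2) - a\omega^2](\xi^2 + \zeta^2 - if\omega) - ibg\omega(\xi^2 + \zeta^2)} \tag{7.8}$$

for the transforms of the components of the displacement vector and of the temperature $\theta$. If we invert the equations (7.8) by the appropriate theorems for Fourier and Hankel transforms (Sneddon 1951, pp. 44 and 52) we find for the components of the displacement vector

$$u = \frac{b}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp[-i(\zeta z + \omega t)]\,d\zeta\,d\omega\int_0^\infty\frac{\xi^2\bar{\Theta}J_1(\xi r)}{D(\xi, \zeta, \omega)}\,d\xi, \tag{7.9}$$

$$w = \frac{ib}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\zeta\exp[-i(\zeta z + \omega t)]\,d\zeta\,d\omega\int_0^\infty\frac{\xi\bar{\Theta}J_0(\xi r)}{D(\xi, \zeta, \omega)}\,d\xi, \tag{7.10}$$

and for the temperature variation

$$\theta = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp[-i(\zeta z + \omega t)]\,d\zeta\,d\omega\int_0^\infty\frac{\xi[\beta^2(\xi^2 + \zeta^2) - a\omega^2]\bar{\Theta}J_0(\xi r)}{D(\xi, \zeta, \omega)}\,d\xi, \tag{7.11}$$

where

$$D(\xi, \zeta, \omega) = [\beta^2(\xi^2 + \zeta^2) - a\omega^2](\xi^2 + \zeta^2 - if\omega) - ibg\omega(\xi^2 + \zeta^2). \tag{7.12}$$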

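In the steady state case

$$\Theta = \vartheta(r, z)$$

so that

$$\bar{\Theta} = (2\pi)^{1/2}\bar{\vartheta}(\xi, \zeta)\delta(\omega). \tag{7.13}$$

Substituting from equation (7.13) into equations (7.11) and (7.12) we obtain the steady state solutions

$$u = \frac{b}{(2\pi)^{1/2}\beta^2}\int_{-\infty}^{\infty}e^{-i\zeta z}\,d\zeta\int_0^\infty\frac{\xi^2\bar{\vartheta}(\xi, \zeta)J_1(\xi r)}{(\xi^2 + \zeta^2)^2}\,d\xi,$$

$$w = \frac{ib}{(2\pi)^{1/2}\beta^2}\int_{-\infty}^{\infty}\zeta e^{-i\zeta z}\,d\zeta\int_0^\infty\frac{\xi\bar{\vartheta}(\xi, \zeta)J_0(\xi r)}{(\xi^2 + \zeta^2)^2}\,d\xi.$$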

P.R.S.E.—VOL. LXV, A, 19,5 8 - 5 9> PART XI 






m, 0 


8. The Quasi-static Solution 

If we consider problems in which the c.g.s. system of units provides natural units of length and time, i.e. if the centimetre is the typical distance and the second the typical time, then it is obvious from Table I that the constant $a$ is very much smaller than the other constants $g$ and $f$ occurring in the dimensionless equations (3.1) to (3.3). It follows, by expanding the integrand in equation (4.13) in ascending powers of $a$, that the approximate solution

$$u_p = \frac{b}{4\pi^2\beta^2}\int_{W_4}\frac{i\xi_p\bar{\Theta}\exp[-i(\xi_q x_q + \omega t)]\,dW}{\xi^2(\xi^2 - if_1\omega)} + \frac{ab}{4\pi^2\beta^4}\int_{W_4}\frac{i\xi_p\omega^2(\xi^2 - if\omega)\bar{\Theta}\exp[-i(\xi_q x_q + \omega t)]\,dW}{\xi^4(\xi^2 - if_1\omega)^2}, \tag{8.1}$$

$$f_1 = f(1 + \epsilon) \tag{8.2}$$

will give a very accurate description of the displacement field in the elastic body. Because $a$ is so very small we may in most cases take it to be zero and describe the displacement field by the quasi-static solution

$$u_p^{(0)} = \frac{b}{4\pi^2\beta^2}\int_{W_4}\frac{i\xi_p\bar{\Theta}\exp[-i(\xi_q x_q + \omega t)]\,dW}{\xi^2(\xi^2 - if_1\omega)}, \quad (p = 1, 2, 3). \tag{8.3}$$

Similarly equations (4.14) and (4.15) may be approximated to by the equations

$$\tau_{pq}^{(0)} + b\theta_c\delta_{pq} = \frac{b}{4\pi^2\beta^2}\int_{W_4}\frac{[(\beta^2 - 2)\xi^2\delta_{pq} + 2\xi_p\xi_q - i\omega bg\xi^2(\xi^2 - if\omega)^{-1}\delta_{pq}]\bar{\Theta}e^{-i(\xi_r x_r + \omega t)}}{\xi^2(\xi^2 - if_1\omega)}\,dW, \tag{8.4}$$

$$\theta^{(0)} - \theta_c = \frac{ibg}{4\pi^2\beta^2}\int_{W_4}\frac{\omega\bar{\Theta}\exp[-i(\xi_p x_p + \omega t)]\,dW}{(\xi^2 - if_1\omega)(\xi^2 - if\omega)}, \tag{8.5}$$

where the "classical temperature" $\theta_c$ has the form (4.16).

The corresponding approximation to the exact solution (6.3) of the two-dimensional problem is given by

$$u_\gamma^{(0)} = \frac{b}{(2\pi)^{3/2}\beta^2}\int_{T_3}\frac{i\xi_\gamma\bar{\Theta}\exp[-i(x_1\xi_1 + x_2\xi_2 + \omega t)]\,dT}{(\xi_1^2 + \xi_2^2)(\xi_1^2 + \xi_2^2 - if_1\omega)}, \quad (\gamma = 1, 2), \tag{8.6}$$



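where, now, $\bar{\Theta}$ is defined by equation (6.2). For this solution the stress components are given by the equation

$$\tau_{\gamma\gamma'}^{(0)} = -\frac{b\delta_{\gamma\gamma'}}{(2\pi)^{3/2}}\int_{T_3}\frac{\bar{\Theta}\exp[-i(x_1\xi_1 + x_2\xi_2 + \omega t)]\,dT}{\xi_1^2 + \xi_2^2 - if\omega} + \frac{b}{(2\pi)^{3/2}\beta^2}\int_{T_3}\frac{(\beta^2 - 2)(\xi_1^2 + \xi_2^2)\delta_{\gamma\gamma'} + 2\xi_\gamma\xi_{\gamma'} - i\omega bg(\xi_1^2 + \xi_2^2)(\xi_1^2 + \xi_2^2 - if\omega)^{-1}\delta_{\gamma\gamma'}}{(\xi_1^2 + \xi_2^2)(\xi_1^2 + \xi_2^2 - if_1\omega)}\,\bar{\Theta}\exp[-i(x_1\xi_1 + x_2\xi_2 + \omega t)]\,dT. \tag{8.7}$$

The quasi-static solution of the axially symmetrical problem is found by putting $a = 0$ in equations (7.9) to (7.12). It is given by the equations

$$u^{(0)} = \frac{b}{2\pi\beta^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-i(\zeta z + \omega t)}\,d\zeta\,d\omega\int_0^\infty\frac{\xi^2\bar{\Theta}J_1(\xi r)\,d\xi}{(\xi^2 + \zeta^2)(\xi^2 + \zeta^2 - if_1\omega)}, \tag{8.8}$$

$$w^{(0)} = \frac{ib}{2\pi\beta^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\zeta e^{-i(\zeta z + \omega t)}\,d\zeta\,d\omega\int_0^\infty\frac{\xi\bar{\Theta}J_0(\xi r)\,d\xi}{(\xi^2 + \zeta^2)(\xi^2 + \zeta^2 - if_1\omega)}, \tag{8.9}$$

$$\theta^{(0)} = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-i(\zeta z + \omega t)}\,d\zeta\,d\omega\int_0^\infty\frac{\xi\bar{\Theta}J_0(\xi r)\,d\xi}{\xi^2 + \zeta^2 - if_1\omega}. \tag{8.10}$$

The quantity $\bar{\Theta}$ occurring in these equations is defined by equation (7.7). It will be observed that in this case the temperature variation $\theta$ has the same form as it has when it is governed by the simple equation for the conduction of heat; the only difference is that the diffusion parameter $f$ is replaced by the parameter $f_1$ defined by equation (8.2).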
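The size of the error committed by putting $a = 0$ can be gauged by comparing the exact denominator with its quasi-static form. The sketch below is only an illustration: it assumes that with $a = 0$ the denominator factorises as $\beta^2\xi^2(\xi^2 - if_1\omega)$ with $f_1 = f + bg/\beta^2$, which is equivalent to $f_1 = f(1 + \epsilon)$ when $\epsilon = bg/(\beta^2 f)$. The function names are hypothetical, and the value of $\beta^2$ used in the usage note is an assumed figure, not one quoted in the paper.

```python
def D_exact(xi2, omega, a, b, f, g, beta2):
    """Exact denominator, written as a function of xi^2 and omega."""
    return (xi2 - 1j * f * omega) * (beta2 * xi2 - a * omega ** 2) \
        - 1j * b * g * omega * xi2

def D_quasi_static(xi2, omega, b, f, g, beta2):
    """Denominator with a = 0, written in terms of f1 = f + b g / beta^2."""
    f1 = f + b * g / beta2
    return beta2 * xi2 * (xi2 - 1j * f1 * omega)
```

With the aluminium constants of Table I ($a = 1.034 \times 10^{-11}$, $b = 0.0630$, $f = 1.168$, $g = 2.687$) and an assumed $\beta^2 = 4.07$, the two denominators at $\xi^2 = 1$, $\omega = 1$ differ by only a few parts in $10^{12}$, which is why the quasi-static solution is so accurate at these scales.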


9. Solutions of Special Problems 

In this section we shall discuss the application of these general formulae 
to the solution of certain special problems. 


(i) The Stress due to a Periodic Line Source 

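We shall begin by considering the stress distribution arising from a line source of periodic strength which lies along the $x_3$-axis. If the source is of frequency $\Omega_0$ then we have a two-dimensional problem in which

$$\Theta = F\delta(x_1)\delta(x_2)e^{i\Omega t}, \tag{9.1}$$

where, in the units of § 3, $\Omega = \Omega_0\tau$ and $F$ is a constant. As a result of a simple integration we see that

$$\bar{\Theta} = \frac{F}{(2\pi)^{1/2}}\delta(\omega + \Omega). \tag{9.2}$$

Substituting from equation (9.2) into equation (6.3) and performing the $\omega$-integration we find that

$$u_\gamma = -\frac{bFx_\gamma e^{i\Omega t}}{2\pi r}\frac{\partial I}{\partial r}, \tag{9.3}$$

where $r^2 = x_1^2 + x_2^2$ and

$$I = \int_0^\infty\frac{\rho J_0(\rho r)\,d\rho}{(\rho^2 + if\Omega)(\beta^2\rho^2 - a\Omega^2) + ibg\Omega\rho^2}.$$

If $\rho_1^2$ and $\rho_2^2$ are the roots of the quadratic equation

$$\rho^4 + (a\Omega^2/\beta^2 - if_1\Omega)\rho^2 - iaf\Omega^3/\beta^2 = 0 \tag{9.4}$$

in $\rho^2$ then

$$u_\gamma = \frac{bFx_\gamma e^{i\Omega t}}{2\pi\beta^2 r}I_1, \tag{9.5}$$

where

$$I_1 = \int_0^\infty\frac{\rho^2 J_1(\rho r)\,d\rho}{(\rho^2 + \rho_1^2)(\rho^2 + \rho_2^2)} \tag{9.6}$$

$$= \frac{1}{\rho_2^2 - \rho_1^2}\left[\int_0^\infty\frac{\rho^2 J_1(\rho r)\,d\rho}{\rho^2 + \rho_1^2} - \int_0^\infty\frac{\rho^2 J_1(\rho r)\,d\rho}{\rho^2 + \rho_2^2}\right]. \tag{9.7}$$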
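The roots of the quadratic (9.4) are easily found numerically. The sketch below assumes the reading $\rho^4 + (a\Omega^2/\beta^2 - if_1\Omega)\rho^2 - iaf\Omega^3/\beta^2 = 0$ with $f_1 = f + bg/\beta^2$, and checks that the denominator of the periodic solution then factorises as $\beta^2(\rho^2 + \rho_1^2)(\rho^2 + \rho_2^2)$. The function names and sample constants are hypothetical.

```python
import cmath

def quadratic_roots(a, b, f, g, beta2, Omega):
    """Return (rho1^2, rho2^2), the roots of the quadratic (9.4)."""
    f1 = f + b * g / beta2
    B = a * Omega ** 2 / beta2 - 1j * f1 * Omega
    C = -1j * a * f * Omega ** 3 / beta2
    disc = cmath.sqrt(B * B - 4.0 * C)
    return (-B + disc) / 2.0, (-B - disc) / 2.0

def denominator(rho2, a, b, f, g, beta2, Omega):
    """Denominator of the periodic solution as a function of rho^2,
    i.e. the general denominator evaluated at omega = -Omega."""
    return (rho2 + 1j * f * Omega) * (beta2 * rho2 - a * Omega ** 2) \
        + 1j * b * g * Omega * rho2
```

When $a$ is as small as the values in Table I one root is much smaller in modulus than the other, and a production routine would want a cancellation-safe form of the quadratic formula; the plain version above is adequate for moderate values of the constants.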

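Similarly if we put

$$\Theta = \frac{F}{2\pi r}\delta(r)e^{i\Omega t}, \tag{9.1a}$$

that is

$$\bar{\Theta} = F\delta(\omega + \Omega)\delta(\zeta) \tag{9.2a}$$

in (7.9) we find that $w = 0$ and

$$u = \frac{bFe^{i\Omega t}}{2\pi\beta^2}\int_0^\infty\frac{\rho^2 J_1(\rho r)\,d\rho}{(\rho^2 + \rho_1^2)(\rho^2 + \rho_2^2)} \tag{9.5a}$$

in agreement with equation (9.6).

Using a well-known formula in the theory of Bessel functions (Watson 1944, p. 434) we find that

$$\int_0^\infty\frac{\rho^2 J_1(\rho r)}{\rho^2 + k^2}\,d\rho = kK_1(kr), \tag{9.8}$$

where $K_\nu(z)$ denotes the modified Bessel function of the second kind of order $\nu$ and argument $z$ and, in terms of the first Hankel function,

$$K_\nu(z) = \tfrac{1}{2}\pi i e^{\frac{1}{2}\nu\pi i}H_\nu^{(1)}(iz)$$

(Watson 1944, p. 78). Inserting this expression for the integrals occurring in equation (9.7) we obtain the exact solution

$$u = \frac{bFe^{i\Omega t}}{2\pi\beta^2(\rho_2^2 - \rho_1^2)}\{\rho_1 K_1(\rho_1 r) - \rho_2 K_1(\rho_2 r)\}. \tag{9.9}$$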

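The temperature variation in the solid is readily found from equation (7.11). We find that

$$\theta = \frac{Fe^{i\Omega t}}{2\pi(\rho_1^2 - \rho_2^2)}\left\{(\rho_1^2 + a\Omega^2/\beta^2)\int_0^\infty\frac{\rho J_0(\rho r)\,d\rho}{\rho^2 + \rho_1^2} - (\rho_2^2 + a\Omega^2/\beta^2)\int_0^\infty\frac{\rho J_0(\rho r)\,d\rho}{\rho^2 + \rho_2^2}\right\},$$

where

$$\int_0^\infty\frac{\rho J_0(\rho r)\,d\rho}{\rho^2 + \rho_j^2} = K_0(\rho_j r)$$

(Watson 1944, p. 434). Hence we find that

$$\theta = \frac{Fe^{i\Omega t}}{2\pi}\left\{\frac{\rho_1^2 + a\Omega^2/\beta^2}{\rho_1^2 - \rho_2^2}K_0(\rho_1 r) - \frac{\rho_2^2 + a\Omega^2/\beta^2}{\rho_1^2 - \rho_2^2}K_0(\rho_2 r)\right\}. \tag{9.10}$$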


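The quasi-static solution obtained by inserting the value of $\bar{\Theta}$ given by equation (9.2), (or (9.2a)), into equation (8.6), (or (8.8)), may be put in the form

$$u^{(0)} = \frac{bFe^{i\Omega t}}{2\pi\beta^2}I^{(0)},$$

where

$$I^{(0)} = \int_0^\infty\frac{J_1(\rho r)\,d\rho}{\rho^2 + if_1\Omega}.$$

Making use of equation (9.8) and a well-known result in the theory of Bessel functions (Watson 1944, p. 391) we find that

$$I^{(0)} = \frac{i}{f_1\Omega r}\{r_0e^{\frac{1}{4}\pi i}K_1(r_0e^{\frac{1}{4}\pi i}) - 1\},$$

where

$$r_0 = r(f_1\Omega)^{1/2}. \tag{9.11}$$

Hence we have

$$u^{(0)} = \frac{iFbe^{i\Omega t}}{2\pi\beta^2 f_1\Omega r}\{r_0e^{\frac{1}{4}\pi i}K_1(r_0e^{\frac{1}{4}\pi i}) - 1\}. \tag{9.12}$$

Furthermore, if we substitute from equation (9.2a) into equation (8.10) we find that the quasi-static value of the temperature is

$$\theta^{(0)} = \frac{Fe^{i\Omega t}}{2\pi}\int_0^\infty\frac{\xi J_0(\xi r)\,d\xi}{\xi^2 + if_1\Omega}.$$

Evaluating by the same formula as before (Watson 1944, p. 434) we find that

$$\theta^{(0)} = \frac{Fe^{i\Omega t}}{2\pi}K_0(r_0e^{\frac{1}{4}\pi i}). \tag{9.13}$$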


Using a known relation for modified Bessel functions of the second kind 
(Watson 1944, p. 80) we write this equation in the form 


J 7 e m 

0 (O) =-—«W-WW). 


277 


From equations (9.10) and (9.13) we find that 

+ + aLm\A' 0 (p, i r)-(p\ -pj) 

^ C0) (pl-^jA' 0 (r 0 e^f . 

For small values of the parameter («£ 2 // 8 2 /) we can show that 


(9-14) 


( 9 . 15 ) 


ft-f/jCO*®*"* 

from which it follows that 

pl + atf/p 2 


1 + 


zeal! 




**0 

T 


2 

Pi~Pz 




ieaCl pi -f aQr/fi* hail 


AF 


2 2 
Pi“ Pa 


/xjP 


and 


A L r-r 0 e <a * +, M p^r^r^, ijj^ea&jsf^, 
where r 0 is defined by equation (9.11) and r x is defined by the equation 




r 


(9.16) 


It should be observed that 






so that, if < 1, it follows that r x < r 0 . Using the relation 

H 0 (ze i,mr ) = jT 0 («) - 

(Watson 1944, p. 80) we find from equation (9.15) that 

e ~ 0 (rt) ( W - JUri - Wrl - MW - M 

PA 1 Kfr 0 )-\mlfr,) j 


(9-U) 


If we use the asymptotic expansions of the modified Bessel functions 
(Watson 1944, p. 202) we find that for large values of r 


where 


But r x 


0 W -f x ^ r " ri) ’ 




r„ so that 


0-O ((t > ~eaCl( 4 \* 
0<°> ~f 1 p\ + rd) 


(9.18) 


showing that for small values of $(a\Omega/f_1\beta^2)$ the quasi-static value of the temperature is a very good approximation to the exact value.


(ii) The Effect of a Moving Line Source

We shall consider now the effect of a line source of heat of constant strength which is moving with uniform velocity $V$ in a direction perpendicular to its own length. If the source remains parallel to the $x_3$-axis and if its velocity is along the $x_1$-axis we may write, in the notation of equation (6.2)

$$\Theta = F\delta(x_2)\delta(x_1 - pt), \tag{9.19}$$

where

$$p = V\tau/l. \tag{9.20}$$

For this function

$$\bar{\Theta}(\xi_1, \xi_2, \omega) = \frac{F}{(2\pi)^{1/2}}\delta(\omega + p\xi_1)$$

so that from equation (8.6) we obtain the quasi-static solution

$$u_\gamma^{(0)} = \frac{bF}{4\pi^2\beta^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{i\xi_\gamma\exp[-i\{(x_1 - pt)\xi_1 + x_2\xi_2\}]\,d\xi_1\,d\xi_2}{(\xi_1^2 + \xi_2^2)(\xi_1^2 + \xi_2^2 + if_1p\xi_1)}.$$




For this solution the components of stress rfj, T aa , t\'!J are given by the 
equation 

jF< 5 S„ 


r* r cx p [-*{(*i“^)^i + **^}] t t .it 

^ J-«J-® ' * Cl Si 


bF r°° f“ C(j3 B - 2)(ll + fl)8yy> + +M& €i(ii + @(lt I- I ‘_V)J 

47T 2 /3 2 J- 


j: 


Xexp [-*{(*! 

For example, we find that 

bF r® r« & | a exp [ - /{(.^ -pi) £, + .Va&lVli r/£ a 


(9.! 


»©■ 


27 T 2 / 3 ! 


f oo roo 

-00 J-c. 


(£1 + £a)(li + fs +</it/li) 


(9,22) 


Performing the integration with respect to £ t we find that 
f" && ex P C - *'{(*1 ~pt)£i + 




(g+mi+&+ifip£i) 
m ?l f ( —_ c ~*^‘_ 0 -f.(n -?«> 1 sin ,fi ) /i 

AP Jo 1(4 £1+/)* I 

and it can be shown that this integral has the value 

2 


2ir*a 

ApR 


AP 




t » 


where R is defined by the equation 

F i ^{x x ~pt) t +x\. 

Substituting from equation (9.23) into equation (9.22) we find that 




bx % F 


AP 


e -»A 


In a similar way we can establish the result 
r® c® e -<(»i -rth-tettt 


(9.33) 

(9-24) 

( 9 -as) 


n w C» 


e -K* 1 -jtt>K 4 ft+/V> i fM 


■ - Ufe -» ( 9>3 6) 


from which we obtain the equations 


•ff+iff- 


bF_ 

fjB*-* 

Tt 

l /S® 


(Sk.7) 






( 0 ) ( 0 ) 


JlP P 


by means of which we may calculate the stress components rfi and riV 


(9.28) 

.CO) 


(iii) The Stress due to an Impulsive Line Source 

We shall now consider the effect of an impulsive line source of strength $F$ applied along the line $x_1 = x_2 = 0$. This may be represented by

$$\Theta(x_1, x_2, t) = F\delta(x_1)\delta(x_2)\delta(t)$$

from which, in the notation of equation (6.2),

$$\bar{\Theta} = \frac{F}{(2\pi)^{3/2}}. \tag{9.29}$$

Substituting from equation (9.29) into equation (6.3) we obtain the solution

$$u_\gamma = \frac{bF}{8\pi^3}\int_{T_3}\frac{i\xi_\gamma\exp[-i(x_1\xi_1 + x_2\xi_2 + \omega t)]\,dT}{\beta^2(\xi_1^2 + \xi_2^2)^2 - (i\omega\beta^2 f_1 + a\omega^2)(\xi_1^2 + \xi_2^2) + iaf\omega^3}.$$

Alternatively if we use cylindrical co-ordinates then in the notation 
of § 7» 

©= — §(/) 8(/) 

271/ 


so that by equation (7.7.) 

q mo , 

27r 


Inserting this expression in equations (7.9), (7.10) and (7. x 1) we find that 
w~-o and that u and 6 are given by the equations 


where 



D - - aw 2 )(| 2 - - zo >^ 2 - 


(9-3°) 

(9-3t) 

( 9 - 33 ) 


If we assume that a=o we find that equation (9.30) reduces to 







« (0 H bF 


' o A < o, 

A>0> 

V' 

so that 

( o t < o, 

, >0 . 

Similarly, from equation (9.31) we find that approximately 

l o t < o, 


Since we have assumed that $a = 0$ we should expect these solutions to be valid if $r < v_s t$ (in conventional units). For very short times, i.e. immediately after the application of the thermal impulse, these expressions would not be valid.

The corresponding components of stress for $t > r/v_s$ are


0?> = 

bF 

-e" 


„(0)_ 

bF 

\f^ 


<7* - 

27T^ 2 fl r% 

\ t 



bF P -Ar‘j 4 t 


z 

27rp 2 t 

j 



J, 


and the shearing stress r rz is identically zero. 

(iv) Impulsive Point Source of Heat 

To illustrate further the use of the axially symmetrical solution we shall 
consider the effect of an impulsive point source of strength q situated at 
the origin. For such a source we have 


©=— B(r) 8 (z) 8 (t) 

277T 


so that, in the notation of equation (7.7), 


( 9 - 33 ) 





Substituting from equation (9.33) into equation (8.8) we obtain the approximate solution, valid for all but very large values of $r/t$,


bq 


. rr.-^r__ 

47 T/ 3 2 J-«J—«> Jo (£ 2 -h £ 2 )(| 2 + l % -ifiU>) 

Performing the integration with respect to to we find that 


and then performing the integration with respect to £ (Erdelyi et aL 1954, 
p. 15), we obtain for the radial component of the displacement the 
expression 


* <0)= mr) { e " - 5) + + Btft k 


(H 
yft 



from which numerical values may be obtained by quadratures. 
Similarly it can be shown that 



III. The Stresses Produced in a Semi-infinite Solid by 
Uneven Heating of the surface 

10. The Solution of the Basic Equations 

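We shall now consider the case in which the solid is bounded by a plane which is free from applied stress but whose surface temperature is made to vary in a prescribed way and which has heat sources of known strength in the interior.

In the case of a semi-infinite solid the symmetry of the equations, when written in cartesian co-ordinates, is not preserved, so that it is no longer advantageous to make use of the notation $x_i$ $(i = 1, 2, 3)$ employed in Part II. In this section we shall denote the co-ordinates of a typical point of the solid by $(x, y, z)$ and assume that the solid is bounded by the plane $z = 0$, occupying the space $z > 0$. If we denote the components of the displacement vector by $(u, v, w)$ and those of the stress tensor by $\sigma_x$, $\sigma_y$, $\sigma_z$, $\tau_{yz}$, $\tau_{zx}$, $\tau_{xy}$ we may write equations (3.1) in the forms

$$\frac{\partial\sigma_x}{\partial x} + \frac{\partial\tau_{xy}}{\partial y} + \frac{\partial\tau_{xz}}{\partial z} = a\frac{\partial^2 u}{\partial t^2}, \tag{10.1}$$

$$\frac{\partial\tau_{xy}}{\partial x} + \frac{\partial\sigma_y}{\partial y} + \frac{\partial\tau_{yz}}{\partial z} = a\frac{\partial^2 v}{\partial t^2}, \tag{10.2}$$

$$\frac{\partial\tau_{xz}}{\partial x} + \frac{\partial\tau_{yz}}{\partial y} + \frac{\partial\sigma_z}{\partial z} = a\frac{\partial^2 w}{\partial t^2} \tag{10.3}$$

(in the absence of body forces). The stress-strain relations (3.2) take the forms

$$\sigma_x = \beta^2\frac{\partial u}{\partial x} + (\beta^2 - 2)\left(\frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right) - b\theta, \tag{10.4}$$

$$\sigma_y = \beta^2\frac{\partial v}{\partial y} + (\beta^2 - 2)\left(\frac{\partial u}{\partial x} + \frac{\partial w}{\partial z}\right) - b\theta, \tag{10.5}$$

$$\sigma_z = \beta^2\frac{\partial w}{\partial z} + (\beta^2 - 2)\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) - b\theta, \tag{10.6}$$

$$\tau_{yz} = \frac{\partial v}{\partial z} + \frac{\partial w}{\partial y}, \tag{10.7}$$

$$\tau_{zx} = \frac{\partial w}{\partial x} + \frac{\partial u}{\partial z}, \tag{10.8}$$

$$\tau_{xy} = \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}, \tag{10.9}$$

where the temperature variation $\theta$ satisfies the equation

$$\frac{\partial^2\theta}{\partial x^2} + \frac{\partial^2\theta}{\partial y^2} + \frac{\partial^2\theta}{\partial z^2} + \Theta = f\frac{\partial\theta}{\partial t} + g\frac{\partial}{\partial t}\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right). \tag{10.10}$$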

If we introduce the sets of transforms 
( 5 m d y , d„ r xv , u, v, 8 ) 

= 7 ^\~<>>\~^\~y ax+ ’' v+u ‘ t)dxdydt \ 0 ^ °*» cr * ! ’ r<ev ’ u> v ’ ^ sin ^ zdz > C 10 - 11 ) 

J pc© pco poo poo 

Tvzj w) “"^2 J oo J op] ^^ XJrnyJrU ^dxdydi J^(r £BJg , r yz3 w) cos £zdz (10.12) 

then multiplying equations (10.1) and (10.2) by exp [i(£x + vjy + <ot)] sin £z 
and equation (xo.3) by exp [i(£x + r}y + <ot)\ cos (£2) and integrating over 
all t throughout the half space and assuming that 


we find that 


0^ = 0 when 2 = 0 

*£°x + iyfxv + £r xz =aco 2 u } 
z£f X y + irjdy + Xjr yz = aoj% 
i£T xz + ir]r vz -f £a z ~ aa) 2 w. 


(10.13) 

(10.14) 





Similarly if we assume that 


u =v = o when z = o 


(10.15) 


we find that the partial differential equations (10.4) to (10.9) are equivalent 
to the following set of six algebraic equations 


= - ij3 2 £u - iQ 8 2 - 2 )rjv - (/3 2 - 2 ) - b§, 

Gy = - iQ 8 2 - 2 ) - zj3 2 rjV - (j8 2 - 2 ) - 35, 

- z(/? 2 -2)£u- z(j3 2 - 2 ) r)V- j3 2 £w-b§, 
Tvz^iZv-irjw), 

t xv = -iQrju + £#). 


(10.16) 


With the same assumptions about the surface values of u and v we find on 
multiplying both sides of equation (10.10) by exp [i(£x 4* rjy + cot)] sin £# 
and integrating that 


where 


+ + -ifcod -gga)U-~g 7 } 0 t)V + iga)[ ) w } (10.17) 


s~r r r e ^ + ^ + “w^r© S m^) 

27 T 3 J-00J-00 J-00 JO 

n oo /*00 

tffwQdxdydt 

-ooj-co / 


(10.18) 


in which 0 O denotes the surface temperature. 

Solving the set of algebraic equations (10.14), (10.16) and (10.17) we 
find that the transforms of the components of the displacement vector are 
given by the equations 


(u, v, ») = (*£, 

where 

Q _ Q + flo _ 

(y 2 - 2/co)(jS 2 y 2 ~ aca 2 ) - ibgcoy 2 


(10.19) 


(10.20) 


with y 2 =P+^ 2 + ^ 2 . Inverting equations (10.19) by means of Fourier’s 
theorem for multiple transforms (Sneddon 1951, p. 45) we find that.the 
components of the displacement vector are given by the equations 

u=—S f f iie-tf+v+^didnda f<? sin (£*)<*£, (xo.21) 

27 T 2 J-oo J-ooJ-oo Jq 





z> = — r f f°° f 5 sin 

277 2 J-°o J-co J-co Jo 

(10.22) 

27 T 2 J-ooJ-ooJ-co JO 

(10.23) 

which can be written in the form 

( u , z/ } ze/) = - grad tp, 

S-f 

p 

w 

with 

7 /tQO /tOO /*00 /tQO 

ffsinM. 

27 T 2 J-°°J~co J-00 Jo 

(10.2s) 


The equations (10.24) and (10.25) constitute the solution of the problem in which there is a source function $\Theta$ in the solid and a surface distribution of temperature $\theta_0$ and the mechanical boundary conditions are defined by equations (10.13) and (10.15). For this solution we find that when $z = 0$ the shearing stresses $\tau_{xz}$, $\tau_{yz}$ are given by the equations

(O.-o-fcfo* ^=^\_ ao \_ x \_J&~ iax+ ' ,y+ “ t)d & r l^\ 0 COdC, ( 10 . 96 ) 

(r».).-o-fr(*. y, t) £<?</£. (10.27) 

If, therefore, we wish to find a solution to the problem in which 

o’ z ~T a3 g — o on z — o 

we have to add to the.solution (10.24) the solution of the (purely elastic) 
boundary value problem 


°* = 0 > T xz= T vz = -q 2 on 0 = 0, (10.28) 

where q 2 a re defined by equations (10.26) and (10.27) respectively. 

The solution of the boundary value problem corresponding to the conditions 
(10.28) has been derived previously (Eason, 1954). It should be noted 
that for the functions involved in this particular form of the problem 

r])J, (10.29) 


where 


V> oS)=zb 



o)dC 

V - if «) 03 V - aco 2 ) - ibg(x>y i 


(10.30) 





11. The Steady State Solution 

We shall now consider the case in which the surface temperature does 
not vary with time and the rate at which heat is generated in the solid is 
constant so that each of the basic equations is independent of the time 
variable t . If 


%=q(x,y, z), O 0 =& 0 (x,y) 

then, in the notation of equations (10.18), 




0 — (277)^8(00) 

where 

^ ( 27 T 3 )J 

[ e^ x+v,y ^dxdy f q sin X^zdz 

~oo J -00 Jo 

and 


d = 2Z? 0 8(o>) 

where 

^0 = 

— [ f ^ x+ ^dxdy. 

277 J-co J-00 


<»■*) 


(11.2) 


(”• 3 ) 


Therefore if we write 

2^ 0 + (27T)V , N 

H = - - - (11.4) 

y 4 

we find that 

G=m(co). (xx.5) 

Substituting from equation (11.5) into equations (10.24) and (10.25) we 
obtain the steady state solution 

$(u, v, w) = -\operatorname{grad} \chi$ (11.6) 

where 

x= i sin ( ix - 7 ) 

with R defined by equations (11.2), (11.3) and (11.4). This solution 
corresponds to the boundary conditions 

$\sigma_z = u = v = 0$ on $z = 0$. 

To obtain a solution to the problem in which 

$\sigma_z = \tau_{xz} = \tau_{yz} = 0$ on $z = 0$ 




we have to add to (11.6) the solution of the statical boundary value problem 
of pure elasticity in which 

$\sigma_z = 0, \quad \tau_{xz} = -s_1, \quad \tau_{yz} = -s_2$ on $z = 0$, (11.8) 

where $s_1$, $s_2$ are defined by the pair of equations 

Oi, s 2 ) IRdl. (11.9) 

The solution of this boundary value problem is well-known (Sneddon 1951, 
p. 444), so that the complete solution of the steady state thermoelastic 
problem can be derived. 


12. The Solution of the Two-dimensional Problem 

We may obtain the solution of the two-dimensional thermoelastic 
equations for a half-space by substituting in equations (10.21) and (10.23) 
the expressions 

0 = (273r)V(£, i, <0)8(1?) 8 0 = m <o) 8(ij), 

where 



£’")- ( 27 r 3 )ij. 

[ f ©(#, z , /) sin (£z)dz, 

-ocj-oo Jo 

(12.1) 

and 


— f f 6(x, 0 } t)e l ^ x+,a ^dxdt. 

27 T J -00 J-oo 

(12.2) 

In this way we obtain the solution 



r r 

27T J-ooJ-o 

sin (&)d£, 

9 JO 

(12-3) 


b f 00 

e-^^do IF cos (&)dt, 

( 12 . 4 ) 

where 

% 

(2 dfq + 

(I2-S) 


(l 2 + £ 2 - W + £ 2 £ 2 - a <o a ) - %<o(f + £ 2 )' 


It is immediately obvious that this solution can be written in the form 





with 


h poo pco poo 

=— f f e-^ + “ ( V^J iFdl (12.7) 

277 J—00 J —qo JO 

For this solution we find that when $z = 0$, $\sigma_z = 0$ and 

A pOQ pCO p» 

t..-— I mt 

277 J- 00 J -00 JO 

so that to obtain the solution corresponding to a boundary free from applied 
stress we must add to the solution (12.6) that solution of the dynamical 
equations which gives for $z = 0$ 

d z = 0 } T xz = - 2T, (12.8) 

where 

T = — f f K t *s)*=o Z KIX+Ut) dxdt=— f IFdl. (12.9) 

277 J-coJ -00 77 Jo 

The solution of this latter problem is (cf. Eason, 1954) 

-t r r **- w *{i+os®-1) 1 £ 1 (12.11) 

277(p“ — I) J-oo J-co 

Hence the complete solution to our two-dimensional problem is contained 
in the equations 

.-AT r r^sm (&)di 

277* J—ooj —00 JO 

+ vl , f f /|-,^ 2 - 0 32 - 0 m ^}e- r (i 2 .i2) 

277“(p* J - l) J-qo J-co J § j JO 

•co poo poo 

e -i(f "+-*>d£th>\ IF cos (iz)di 
-00 J-CO Jo 

— r r - 1) 1 * i *}e - r sm. (i2.x 3 ) 

277“(p* — I) J-QO J -00 J 0 

On the boundary $z = 0$ we find that the normal displacement is 

P.R.S.E,—VOL. LXV, A, 1958-59, PART II 





13. The Effect of a Periodic Line Temperature Applied to the Surface 

To illustrate the use of the above formulae we shall consider the effect 
of a line distribution of temperature of magnitude $P$ which is periodic in 
time applied along the line $x = z = 0$ which may be represented by the 
formula 

$\theta(x, 0, t) = P\,\delta(x)\,e^{i\Omega t},$ (13.1) 

so that from equation (12.2) 

$\bar{\theta}(\xi, \omega) = P\,\delta(\omega + \Omega),$ 

and from equation (12.5) 

zvt r \_ -P£8(«+£2) 

m, l, «) + + (' 3 -z) 

where pi and pi are the roots of the quadratic equation (9.4). For this 
function 




7tP8(cO + Q) 
z(Pi-pl)P 2 


ae+pty-ie+piA 


£ era, e, «) c 0S +&*-<*+**% 

]>, C, «) sin 

J ° 2 (Pl-P 2 )P 


Substituting from these expressions into equations (12.12) and (12.13) we 
obtain the expressions 


« 


bPt iat 

™(pl~pW 





_ e -«’+pi) i «] s j n ax)dt- 


bPe iat 


1) 


£[^-(^-i)^][(|2 +p S)i_(p +p !)i ]e -t* sin ( ^ )4j (I3 . 3) 


^Pe* 0 * r» , 

‘ -rteTSpl K co. <*« 

^e iot r 

"^ (pf-pDAjS 8 -!) Jo [l + ( ^~ + cos (x&i. ( 13 . 4 ) 




Now it is readily shown from known results (Erdelyi 1954, PP- *6 and 75) 
that 

£ sin (gx)d$-4 *i0v)], 

| 0 [(£ 2+ Pi) ie ~ tta+p!)i *-(£ 8 + P2) i e -(P+psi *] cos (gx)d£ 

A[pt*M -plA\( Pl r)] 

where $r^2 = x^2 + z^2$. The remaining integrals in equations (13.3) and (13.4) 
are in a form suitable for treatment by a method similar to that used by 
Lamb (1904). 


IV. References to Literature 

Biot, M. A., 1956. "Thermoelasticity and Irreversible Thermodynamics", 
J. Appl. Phys., 27, 240. 

Brillouin, L., 1938. Tenseurs en Mécanique et en Élasticité. Paris: Masson et 
Cie. 

Bullen, K. E., 1947. An Introduction to the Theory of Seismology. Cambridge 
University Press. 

Carslaw, H. S., and Jaeger, J. C., 1947. The Conduction of Heat in Solids. 
Oxford University Press. 

Chadwick, P., and Sneddon, I. N., 1958. "Plane Waves in an Elastic Solid 
Conducting Heat", J. Mech. Phys. Solids, 6, 223. 

Duhamel, J. M. C., 1837. "Second Mémoire sur les Phénomènes Thermo-Mécaniques", 
J. Éc. Polyt. Paris, 15, 1. 

Eason, G., 1954. Ph.D. Thesis, University of Birmingham. 

Eason, G., Fulton, J., and Sneddon, I. N., 1956. "The Generation of Waves 
in an Infinite Elastic Solid by Variable Body Forces", Phil. Trans., A, 248, 
575. 

Erdélyi, A. (Edit.), 1954. Tables of Integral Transforms. Vol. 1. New York: 
McGraw-Hill Book Co. 

Jeffreys, H., 1930. "The Thermodynamics of an Elastic Solid", Proc. Camb. 
Phil. Soc., 26, 101. 

Lamb, H., 1904. "On the Propagation of Tremors over the Surface of an Elastic 
Solid", Phil. Trans., A, 203, 1. 

Lessen, M., 1956. "Thermoelasticity and Thermal Shock", J. Mech. Phys. 
Solids, 5, 57. 

Lessen, M., 1957. "The Motion of a Thermoelastic Solid", Quart. Appl. Math., 15, 
105. 

Neumann, F. E., 1885. Vorlesungen über die Theorie der Elasticität der festen 
Körper. Leipzig. 

Sokolnikoff, I. S., 1956. The Mathematical Theory of Elasticity. New York: 
McGraw-Hill Book Co. 

Sneddon, I. N., 1951. Fourier Transforms. New York: McGraw-Hill Book 
Co. 

Sneddon, I. N., 1958. "The Propagation of Thermal Stresses in Thin Metallic Rods", 
Proc. Roy. Soc. Edin., A, 65, 121. 

Sneddon, I. N., and Berry, D. S., 1958. "The Classical Theory of Elasticity", Handb. 
Phys., 6, 1-124. 

Voigt, W., 1910. Lehrbuch der Kristallphysik. Berlin: Teubner-Verlag. 

Watson, G. N., 1944. A Treatise on the Theory of Bessel Functions. 2nd ed. 
Cambridge University Press. 

Weiner, J. H., 1957. "A Uniqueness Theorem for the Coupled Thermoelastic 
Problem", Quart. Appl. Math., 15, 102. 


(Issued separately October 26, 1959) 





XI. The Free Commutative Entropic Logarithmetic.* By 
H. Minc, M.A.(Edin.).† Communicated by Dr I. M. H. 
Etherington. 


(MS. received May 12, 1958. Read November 10, 1958) 


Synopsis 

The commutative and entropic congruence relations determine a homomorphism on 
the free logarithmetic $\mathfrak{L}$, the arithmetic of the indices of powers of the generating element 
of a free cyclic groupoid. A necessary and sufficient condition that two indices should 
be concordant (i.e. congruent in the free commutative entropic logarithmetic) is that the 
bifurcating trees corresponding to these indices should have the same number of free 
ends at each altitude. It follows that the free commutative entropic logarithmetic can 
be represented faithfully by index $\psi$-polynomials (or $\theta$-polynomials) in one indeterminate. 

In the concluding section enumeration formulae are obtained for the number of 
non-concordant indices of a given altitude and for the number of indices concordant to a 
given index. 


1. Introduction 

THE present paper is a continuation of a previous paper (Minc 1957 ‡) 
and uses the nomenclature of the latter without further explanation. 

Addition in the free logarithmetic $\mathfrak{L}$ (ibid., p. 321) is non-associative 
and non-commutative. Multiplication in $\mathfrak{L}$ is associative and right-distributive 
but not commutative nor left-distributive. The following 
congruence relations determine therefore homomorphisms on $\mathfrak{L}$: 


commutative: $P + Q \sim Q + P$, (c) 

palintropic: $PQ \sim QP$, (p) 

left-distributive: $(P + Q)R \sim PR + QR$, (d) 

and entropic: $(P + Q) + (R + S) \sim (P + R) + (Q + S)$ (e) 

(cf. Etherington 1949). 


We denote the homomorph of $\mathfrak{L}$ determined by congruence relation (r) 
by $\mathfrak{L}_r$. It is known that $\mathfrak{L}_e$ is a homomorph of $\mathfrak{L}_p$ and that $\mathfrak{L}_p$ is isomorphic 

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 

† Present Address: Department of Mathematics, University of British Columbia, 
Vancouver 8, B.C., Canada. 

‡ I take this opportunity to correct a mis-statement on p. 335 of that paper. In 
line 21 for “the potency of a prime tree is a prime number” read “a tree is prime if its 
potency is a prime number”. 




to $\mathfrak{L}_d$. The above four relations determine therefore only five distinct 
homomorphs: the free commutative logarithmetic $\mathfrak{L}_c$, the free palintropic 
logarithmetic $\mathfrak{L}_p$, the free entropic logarithmetic $\mathfrak{L}_e$, the free commutative 
palintropic logarithmetic $\mathfrak{L}_{cp}$ and the free commutative entropic logarithmetic 
$\mathfrak{L}_{ce}$. The study of $\mathfrak{L}_{ce}$ and its representations by $\psi$- and $\theta$-polynomials 
in one indeterminate forms the main part of this paper. The 
paper concludes with a section on enumeration of indices. 

The author is indebted to Dr I. M. H. Etherington for many helpful 
suggestions. 


2. Congruence Relations on $\mathfrak{L}$ 

$P = Q$ means that $P$ and $Q$ represent the same index in $\mathfrak{L}$ or the same 
tree in $\mathfrak{T}$ (v. Minc 1957, p. 322). We shall say that $P$ is congruent to 
$Q$ modulo (r) and write $P \sim Q$ mod (r) if (r) is an equivalence relation 
on $\mathfrak{L}$ and 

either (i) $P = Q$; 

or (ii) $P \sim Q$ mod (r) by direct application of (r) (e.g. $2 + 3 \sim 3 + 2$ 
mod (c); $2 \cdot 3 \sim 3 \cdot 2$ mod (p); etc.); 

or (iii) $P = P' + P''$, $Q = Q' + Q''$ and $P' \sim Q'$, $P'' \sim Q''$ mod (r); 

or (iii') $P = RP'$ and $Q = RQ'$ and $P' \sim Q'$ mod (r); 

or (iv) $P = R_1 \sim R_2 \sim \dots \sim R_k = Q$ where $R_i \sim R_{i+1}$ mod (r) 
$(1 \le i \le k - 1)$ by virtue of (i) or (ii) or (iii) or (iii'). 

We prove that “congruence” on $\mathfrak{L}$ thus defined is a congruence relation 
in the usual sense (cf. Birkhoff 1948, p. vii). It is obviously a congruence 
relation for addition. It suffices to prove 

Theorem 1. If $P \sim P'$ and $Q \sim Q'$ mod (r) then $PQ \sim P'Q'$ mod (r). 

Proof. $PQ \sim PQ'$, by (iii'). We prove that the premises of the 
theorem imply $PQ' \sim P'Q'$ mod (r). Use non-associative induction on $Q'$ 
(v. Minc 1957, p. 322, footnote). If $Q' = 1$ there is nothing to prove. 
Otherwise let $Q' = Q'_1 + Q'_2$. Assume that $PQ'_1 \sim P'Q'_1$ and $PQ'_2 \sim P'Q'_2$. 
Then, by (iii), $PQ'_1 + PQ'_2 \sim P'Q'_1 + P'Q'_2$, i.e. $PQ' \sim P'Q'$, since multiplication 
in $\mathfrak{L}$ is right-distributive. 

For all congruence relations considered in the present paper case (iii') 
of the definition follows from the other four cases. 

Theorem 2. Let $\rho_1$, $\rho_2$, $\rho_3$ be equivalence relations on $\mathfrak{L}$ defined as 
follows: 

(1) $P \rho_1 Q$ if (I) $P = S + T$ and $Q = T + S$; 

(2) $P \rho_2 Q$ if (I) $P = ST$ and $Q = TS$; 

(3) $P \rho_3 Q$ if (I) $P = (S + T) + (U + V)$ and $Q = (S + U) + (T + V)$; 

also $P \rho_i Q$ $(i = 1, 2, 3)$ if 

either (II) $P = Q$; 

or (III) $P = P' + P''$, $Q = Q' + Q''$ where $P' \rho_i Q'$ and $P'' \rho_i Q''$; 

or (IV) $P = R_1 \rho_i R_2 \rho_i R_3 \rho_i \dots \rho_i R_k = Q$ where $R_s \rho_i R_{s+1}$ by virtue of 
(I) or (II) or (III). 

Then $P \rho_1 Q$ is equivalent to $P \sim Q$ mod (c), 
$P \rho_2 Q$ is equivalent to $P \sim Q$ mod (p) 
and $P \rho_3 Q$ is equivalent to $P \sim Q$ mod (e). 

Proof. $\rho_1$, $\rho_2$, $\rho_3$ are obviously congruence relations for addition. It 
remains to prove that $P \rho_i Q$ implies $RP \rho_i RQ$ $(i = 1, 2, 3)$. We consider 
the four cases in which $P \rho_i Q$. 

(1) If (I) $P = S + T$ and $Q = T + S$ then 

$R(S + T) = RS + RT$, since multiplication in $\mathfrak{L}$ is right-distributive, 
$\rho_1\ RT + RS$, by ($\rho_1$), 
$= R(T + S)$. 

Hence, by (IV), $RP \rho_1 RQ$. 

(2) If (I) $P = ST$ and $Q = TS$ the proof is by non-associative induction. 
When $T = 1$, $P = Q$ and thus $RP \rho_2 RQ$, by (II). Let $T = T' + T''$ 
and assume that $RST' \rho_2 RT'S$ and $RST'' \rho_2 RT''S$. 

Then $RST = RST' + RST''$ 
$\rho_2\ RT'S + RT''S$, by the induction hypothesis and (III), 
$\rho_2\ SRT' + SRT''$, since $(RT')S \rho_2 S(RT')$ and $(RT'')S \rho_2 S(RT'')$, 
$= SR(T' + T'')$ 
$= SRT\ \rho_2\ RTS$, by ($\rho_2$). 

Hence $RP \rho_2 RQ$. 

(3) If $P = (S + T) + (U + V)$, $Q = (S + U) + (T + V)$ then 

$RP = (RS + RT) + (RU + RV)$, since multiplication in $\mathfrak{L}$ is right-distributive, 
$\rho_3\ (RS + RU) + (RT + RV)$, by ($\rho_3$), 
$= R\{(S + U) + (T + V)\}$ 
$= RQ$. 

Hence $RP \rho_3 RQ$. 





Further, for all three relations: 

if (II) $P = Q$, we have $RP = RQ$ and thus $RP \rho_i RQ$; 

if (III) $P = P' + P''$, $Q = Q' + Q''$, where $P' \rho_i Q'$, $P'' \rho_i Q''$, the result is 
easily provable by induction on altitude (or potency) of $P$; 

if (IV) $P = R_1 \rho_i R_2 \rho_i R_3 \rho_i \dots \rho_i R_k = Q$, where $R_s \rho_i R_{s+1}$ by virtue of 
(I), (II) or (III), the proof is by induction on $k$. 

It follows from the above theorem that congruence relations mod (p), 
(c), (e) are completely defined by cases (i), (ii), (iii) and (iv) of the definition 
in § 2 and in all subsequent proofs in which it is premised that indices or 
trees are congruent it will suffice to consider these four cases only. 


3. Free Commutative Logarithmetic. Free 
Entropic Logarithmetic 

Commutative logarithmetics have been studied by Etherington (1939, 
1940 and 1949). In this paper we only add a theorem on faithful representations 
of the free commutative logarithmetic $\mathfrak{L}_c$. 

Denote the homomorphs of $\Psi$, $\Theta$, $\Omega$ (the algebras of all $\psi$-, $\theta$-, $\omega$-polynomials) 
determined by congruence relations 

$\psi(\lambda, \mu) \sim \psi(\mu, \lambda)$, $\theta(\lambda, \mu) \sim \theta(\mu, \lambda)$, $\omega(\lambda, \mu) \sim \omega(\mu, \lambda)$ 

by $\Psi_c$, $\Theta_c$, $\Omega_c$ respectively. 

THEOREM 3. $\mathfrak{L}_c$ is faithfully represented by $\Psi_c$; also by $\Theta_c$ and by 
$\Omega_c$. 

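Proof. We prove first that 

$\psi(\lambda, \mu) \sim \psi(\mu, \lambda)$ (c) 

implies $\psi_{P+Q} \sim \psi_{Q+P}$. 

$\psi_{P+Q}(\lambda, \mu) = \psi_P(\lambda, \mu)\cdot\lambda + \psi_Q(\lambda, \mu)\cdot\mu$ 
$\sim \psi_{P+Q}(\mu, \lambda)$, by (c), 
$= \psi_P(\mu, \lambda)\cdot\mu + \psi_Q(\mu, \lambda)\cdot\lambda$ 
$\sim \psi_P(\lambda, \mu)\cdot\mu + \psi_Q(\lambda, \mu)\cdot\lambda$, by (c), 
$= \psi_{Q+P}(\lambda, \mu)$. 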

To prove the converse, i.e. that tjj p+Q ~ ip Q+p implies (c), note that. ^ ~ 
mod (c) and use induction on the altitude of P + Q. When ap +Q = i, 





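$\psi_{P+Q}(\lambda, \mu) = \lambda + \mu$ while $\psi_{P+Q}(\mu, \lambda) = \mu + \lambda$. Suppose the theorem holds for 
altitudes less than $a$ $(a > 1)$ and let $\alpha_{P+Q} = a$. Then 

$\psi_{P+Q}(\lambda, \mu) \sim \psi_{Q+P}(\lambda, \mu)$ 
$= \psi_P(\lambda, \mu)\cdot\mu + \psi_Q(\lambda, \mu)\cdot\lambda$ 
$\sim \psi_P(\mu, \lambda)\cdot\mu + \psi_Q(\mu, \lambda)\cdot\lambda$, by the induction hypothesis since $\alpha_P, \alpha_Q < a$, 
$= \psi_{P+Q}(\mu, \lambda)$. 

The proof for $\theta$- and $\omega$-polynomials is almost identical. 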

The free entropic logarithmetic $\mathfrak{L}_e$ is a homomorph of the free palintropic 
logarithmetic, that is to say $PQ$ and $QP$ are congruent in $\mathfrak{L}_e$ for all 
$P$ and $Q$. This result has been essentially obtained by Murdoch (1939, 
Corollary to Theorem 10) and in a more general form by Etherington 
(1949, Theorem 4). Etherington has also proposed the question (1951, 
p. 249) whether the free entropic logarithmetic is represented faithfully 
by index $\psi$-polynomials in commuting indeterminates $\lambda$, $\mu$. Call index 
polynomials in commuting indeterminates palindromic. It is known that 
palindromic $\psi$-polynomials represent faithfully the logarithmetic of the 
general train algebra of rank 3 (v. Etherington 1951, p. 249). Etherington’s 
question amounted therefore to this: are the free entropic logarithmetic 
and the logarithmetic of the general train algebra of rank 3 isomorphic? 
In 1954 I communicated to Dr Etherington the following 
example which answers the question in the negative. 

Example .—The indices ( 4 *fi) + (1+3) and (3 + 1) +(1 +4) are not 
congruent mod (e) although their palindromic ^-polynomials are both 
equal to A 2 ^ 2 4- A 2 /x 4 - A/P 4 -A 2 4 -/P + A-b/x 4 -i. 


4. Free Commutative Entropic Logarithmetic 

This logarithmetic has particularly interesting faithful representations 
by index polynomials. We introduce for it a special nomenclature and 
notation. If two indices or trees, $P$ and $Q$, are congruent mod (c) (e) we 
call them concordant and we write $P \approx Q$. 

Observe that $(P + Q) + (R + S) \approx (A + B) + (C + D)$ where $(A, B, C, D)$ 
is any of the 4! permutations of $(P, Q, R, S)$. Indeed this fact together 
with the relation $1 + P \approx P + 1$ are equivalent to (c) (e). This suggests 

THEOREM 4. If two subtrees of $P$ of the same order* be transposed 
the resulting tree $Q$ is concordant to $P$. 

* V. Minc 1957, p. 322. 



Proof. The theorem holds trivially when $P$ is of altitude 1. We use 
induction on altitude and assume that the theorem is true for trees of 
altitudes less than $a$. Let $P$ be of altitude $a$. 

(1) If $P = P' + 1$ or $1 + P'$, both subtrees must belong to $P'$, a tree of 
altitude $a - 1$; the result follows by the induction hypothesis. 

(2) If $P = (P_1 + P_2) + (P_3 + P_4)$ then: 

(a) If both subtrees belong to $P_1 + P_2$, a tree of altitude $a - 1$, 
the theorem again follows by the induction hypothesis. 
Similarly if the two subtrees belong to $P_3 + P_4$. 

(b) If one subtree belongs to $P_1$ and the other to $P_3$ (or one to $P_2$ 
and the other to $P_4$) the result follows from (a) since 

$(P_1 + P_2) + (P_3 + P_4) \approx (P_1 + P_3) + (P_2 + P_4)$. 

(c) If one subtree belongs to $P_1$ and the other to $P_4$ (or one to 
$P_2$ and the other to $P_3$) the result follows from (a) since 

$(P_1 + P_2) + (P_3 + P_4) \approx (P_1 + P_2) + (P_4 + P_3) \approx (P_1 + P_4) + (P_2 + P_3)$. 

(Note. The proof of the equivalent proposition (which is false; 
v. Example in § 3) for the non-commutative entropic logarithmetic fails 
in case (2) (c).) 

We shall require a more general form of this result. 

Lemma. If $P$ has a free end at altitude $a$ then any tree $Q$ concordant 
to it has also a free end at the same altitude. 

Proof. The lemma is quite obvious 

if (i) $P = Q$, 

or if (ii) (1) $P = (R + S) + (T + U)$ and $Q = (R + T) + (S + U)$, 
or if (ii) (2) $P = R + S$ and $Q = S + R$. 

It is easily provable by induction on altitude 
if (iii) $P = R + S$, $Q = R' + S'$ and $R \approx R'$, $S \approx S'$, 
and by induction on $k$ 
if (iv) $P = R_1 \approx R_2 \approx \dots \approx R_k = Q$. 

If $R$ is a subordinate of the $n$th order of $P$ (v. Minc 1957, p. 325) we 
shall call $P$ a superior of the $n$th order of $R$. We shall use the following 
notation: Let a superior of the first order of $R$ be denoted by $\bar{R}$ or by $\bar{R}_{(a)}$ 
if the node of the additional fork in the superior is at altitude $a$. $\bar{R}$ (or $\bar{R}_{(a)}$) 
denotes a definite though unspecified tree, not the set of all trees having 
$R$ for a subordinate. 




Theorem 5. If $P$ and $Q$ are two concordant trees each with a free 
end at altitude $a$ then $\bar{P}_{(a)} \approx \bar{Q}_{(a)}$. 

Proof. Consider in turn the four cases defining $P \approx Q$. 

(i) If $P = Q$ the result follows from Theorem 4. 

(ii) (1) $P = (R + S) + (T + U)$ and $Q = (R + T) + (S + U)$. One at 
least of $R$, $S$, $T$, $U$ has a free end at altitude $a - 2$; let it 
be $R$. Then 

$\bar{P}_{(a)} \approx (\bar{R}_{(a-2)} + S) + (T + U)$, by Theorem 4, 
$\approx (\bar{R}_{(a-2)} + T) + (S + U)$, by (e), 
$\approx \bar{Q}_{(a)}$, by Theorem 4. 

(ii) (2) $P = R + S$ and $Q = S + R$. First suppose that $R$ has a free 
end at altitude $a - 1$. Then 

$\bar{P}_{(a)} \approx \bar{R}_{(a-1)} + S$, by Theorem 4, 
$\approx S + \bar{R}_{(a-1)}$, by (c), 
$\approx \bar{Q}_{(a)}$, by Theorem 4. 

If $R$ has no free ends at altitude $a - 1$, $S$ must have one; the 
proof is then similar. 

(iii) $P = R + S$, $Q = R' + S'$ and $R \approx R'$, $S \approx S'$. Suppose that $R$ has 
a free end at altitude $a - 1$. Then, by the lemma, $R'$ has 
a free end at the same altitude and $\bar{P}_{(a)} \approx \bar{R}_{(a-1)} + S$, 
$\bar{Q}_{(a)} \approx \bar{R}'_{(a-1)} + S'$. These are concordant if $\bar{R}_{(a-1)} \approx \bar{R}'_{(a-1)}$. 
Use therefore induction on $a$. Again, if $R$ has no free ends 
at altitude $a - 1$ then $S$ must have one and the proof is similar. 

(iv) If $P = R_1 \approx R_2 \approx \dots \approx R_k = Q$ then by the lemma each $R_i$ has 
a free end at altitude $a$. The proof is by induction on $k$. 

We are now in a position to prove the principal theorem on the structure 
of concordant trees. 

THEOREM 6. Two trees are concordant if and only if they have the 
same number of free ends at each altitude. 

Proof. Let the two trees be $P$ and $Q$. 

Necessity. Let $p_i$, $q_i$, $r_i$, $s_i$, $t_i$, $u_i$ denote the numbers of free ends at 
altitude $i$ in trees $P$, $Q$, $R$, $S$, $T$, $U$ respectively. 

(i) If $P = Q$ there is nothing to prove. 

(ii) (1) If $P = (R + S) + (T + U)$ and $Q = (R + T) + (S + U)$ then 
$p_i = q_i = r_{i-2} + s_{i-2} + t_{i-2} + u_{i-2}$ $(2 \le i \le \alpha_P)$ and $p_0 = p_1 = q_0 = q_1 = 0$. 

(ii) (2) If $P = R + S$ and $Q = S + R$ then $p_i = q_i = r_{i-1} + s_{i-1}$ $(1 \le i \le \alpha_P)$ 
and $p_0 = q_0 = 0$. 

(iii) $P = R + S$, $Q = T + U$ and $R \approx T$, $S \approx U$. The condition is 
obviously necessary if $\alpha_P = 1$. Use induction on altitude of $P$. 
The lemma to Theorem 5 implies that altitudes of concordant 
trees are equal. The altitudes of $P$ and $Q$ are therefore equal 
and those of $R$, $S$, $T$, $U$ are all less than $\alpha_P$. Thus, by 
the induction hypothesis, $r_i = t_i$ and $s_i = u_i$ for all $i$. But 
$p_i = r_{i-1} + s_{i-1}$ and $q_i = t_{i-1} + u_{i-1}$. Hence $p_i = q_i$. 

(iv) If $P = R_1 \approx R_2 \approx \dots \approx R_k = Q$ the necessity is proved by 
induction on $k$. 

Sufficiency. Note that the potency of any tree $T$ is $\delta_T = \sum_i t_i$ and use 
induction on the potency of $P$. 

If $\delta_P = 1$, $P = Q = 1$ and the condition is obviously sufficient. Assume 
that it is sufficient for trees of potency less than $d$. Let $\delta_P = \delta_Q = d$, 
$\alpha_P = \alpha_Q = a$ and let $R$, $S$ be the first principal subordinates (v. Minc 1957, 
p. 325) of $P$, $Q$ respectively. Then, since $p_i = q_i$ $(1 \le i \le a)$, $r_i = s_i = p_i$ 
$(1 \le i \le a - 2)$, $r_{a-1} = s_{a-1} = p_{a-1} + 1$ and $r_a = s_a = p_a - 2$. But the potencies 
of $R$ and $S$ are equal to $d - 1$. Thus, by the induction hypothesis, $R$ and 
$S$ are concordant and, by Theorem 5, $\bar{R}_{(a-1)} \approx \bar{S}_{(a-1)}$. Now, by Theorem 4, 
$P \approx \bar{R}_{(a-1)}$ and $Q \approx \bar{S}_{(a-1)}$. Hence the result. 

A similar necessary and sufficient condition can be obtained for 
numbers of nodes (or of all knots) at each altitude. 
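
The criterion of Theorem 6 lends itself to a small computational illustration. The sketch below is not part of the paper: it assumes a tree is stored as a nested pair, with the integer 1 standing for a free end, and the names `fork`, `free_end_profile` and `concordant` are hypothetical helpers introduced only for this example. It counts free ends by altitude and uses equality of the counts as the test of concordance.

```python
from collections import Counter

LEAF = 1  # the index 1, i.e. a tree that is a single free end (assumed encoding)


def fork(left, right):
    """The tree of the index P + Q: a node whose two branches are P and Q."""
    return (left, right)


def free_end_profile(tree, altitude=0):
    """Return a Counter mapping altitude -> number of free ends at that altitude."""
    if tree == LEAF:
        return Counter({altitude: 1})
    left, right = tree
    return free_end_profile(left, altitude + 1) + free_end_profile(right, altitude + 1)


def concordant(p, q):
    """Theorem 6 used as a test: same number of free ends at every altitude."""
    return free_end_profile(p) == free_end_profile(q)


two = fork(LEAF, LEAF)      # 1 + 1
three = fork(two, LEAF)     # (1 + 1) + 1

# Entropic rearrangement (R + S) + (T + U) -> (R + T) + (S + U)
R, S, T, U = three, two, LEAF, two
P = fork(fork(R, S), fork(T, U))
Q = fork(fork(R, T), fork(S, U))

print(dict(free_end_profile(P)))
print(concordant(P, Q))
print(concordant(fork(two, two), fork(three, LEAF)))
```

Under these assumptions the entropic pair $P$, $Q$ is reported as concordant, while the two trees of potency 4 in the last line are not, since their free ends lie at different altitudes.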

5. $\psi$- and $\theta$-Polynomials in One Indeterminate 

The altitude of a knot is equal to the degree in $\lambda$, $\mu$ of its term (v. Minc 
1957). This and Theorem 6 suggest that concordant trees (or indices) 
can be represented by polynomials in one indeterminate in which the 
degree of each term corresponds to the altitude and the coefficient to the 
number of free ends at this altitude. We now introduce such index 
polynomials, study their properties and prove that $\mathfrak{L}_{ce}$ is faithfully represented 
by them. It turns out that these are Etherington’s original index 
polynomials (cf. Etherington 1940). 

The algebras of the two types of index polynomials (cf. Minc 1957) 
defined below are homomorphs of $\Psi$ and $\Theta$ determined by the congruence 
relations: 

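$\psi(\lambda, \mu) \sim \psi'(\lambda, \mu)$ if $\psi(\lambda, \lambda) = \psi'(\lambda, \lambda)$ in $\mathfrak{M}[\lambda]$; 

$\theta(\lambda, \mu) \sim \theta'(\lambda, \mu)$ if $\theta(\lambda, \lambda) = \theta'(\lambda, \lambda)$ in $\mathfrak{M}[\lambda]$. 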




It is convenient therefore to call them index $\psi$- and $\theta$-polynomials in one 
indeterminate and to denote them by $\psi_P(\lambda)$ and $\theta_P(\lambda)$ or, where no confusion 
is likely to arise (as in this section), simply $\psi$- and $\theta$-polynomials and to 
write $\psi_P$ and $\theta_P$. 

Definitions. (i) Index $\psi$-polynomials in one indeterminate: 

$\psi_1(\lambda) = 1$, $\psi_{P+Q}(\lambda) = \lambda\{\psi_P(\lambda) + \psi_Q(\lambda)\}$. 

(ii) Index $\theta$-polynomials in one indeterminate: 

$\theta_1(\lambda) = 0$, $\theta_{P+Q}(\lambda) = \lambda\{\theta_P(\lambda) + \theta_Q(\lambda)\} + 1$. 

We have $\psi_P = (2\lambda - 1)\theta_P + 1$. This is easily proved by non-associative 
induction. For, since $\theta_1 = 0$, $\psi_1 = (2\lambda - 1)\theta_1 + 1$ and if we assume that 
$\psi_Q = (2\lambda - 1)\theta_Q + 1$ and $\psi_R = (2\lambda - 1)\theta_R + 1$ then 

$\psi_{Q+R} = \lambda\{(2\lambda - 1)\theta_Q + 1 + (2\lambda - 1)\theta_R + 1\}$ 
$= (2\lambda - 1)\{\lambda(\theta_Q + \theta_R) + 1\} + 1$ 
$= (2\lambda - 1)\theta_{Q+R} + 1$. 
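
As an informal check of the two definitions and of the identity $\psi_P = (2\lambda - 1)\theta_P + 1$, the following sketch evaluates both recursions on a few trees. It is an illustration under stated assumptions rather than anything given in the paper: a tree is a nested pair with 1 for a free end, a polynomial is a list of coefficients in ascending powers of $\lambda$, and the function names are hypothetical.

```python
LEAF = 1  # the index 1 (a single free end); a fork P + Q is the pair (P, Q)


def poly_add(a, b):
    """Add two polynomials given as coefficient lists [c0, c1, ...]."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]


def trim(a):
    """Drop trailing zero coefficients, keeping at least the constant term."""
    while len(a) > 1 and a[-1] == 0:
        a = a[:-1]
    return a


def psi(tree):
    """psi_1 = 1,  psi_{P+Q} = lambda * (psi_P + psi_Q)."""
    if tree == LEAF:
        return [1]
    left, right = tree
    return [0] + poly_add(psi(left), psi(right))


def theta(tree):
    """theta_1 = 0,  theta_{P+Q} = lambda * (theta_P + theta_Q) + 1."""
    if tree == LEAF:
        return [0]
    left, right = tree
    return trim(poly_add([0] + poly_add(theta(left), theta(right)), [1]))


def psi_from_theta(t):
    """Evaluate (2*lambda - 1) * theta + 1 on a coefficient list."""
    two_lambda_theta = [0] + [2 * c for c in t]
    minus_theta = [-c for c in t]
    return trim(poly_add(poly_add(two_lambda_theta, minus_theta), [1]))


two = (LEAF, LEAF)
three = (two, LEAF)
samples = [LEAF, two, three, (two, two), ((three, two), (LEAF, two))]
for tree in samples:
    assert psi_from_theta(theta(tree)) == psi(tree)
print(psi(three), theta(three))
```

For the tree of the index $(1 + 1) + 1$ this gives $\psi = \lambda + 2\lambda^2$ and $\theta = 1 + \lambda$, in line with the reading that the coefficients of $\psi$ count free ends and those of $\theta$ count nodes at each altitude.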

Call the term of maximal degree in $\lambda$ in a polynomial $\phi(\lambda)$ the leading 
term of $\phi(\lambda)$. It is easily seen that all coefficients in index polynomials 
defined above are non-negative integers and that the coefficient of the 
leading term of $\psi_P$ is even. 

THEOREM 7. The polynomial $\phi = 2\lambda^i + \phi'$, where $\phi'$ is a polynomial in 
$\lambda$ with non-negative integer coefficients, is an index $\psi$-polynomial if and 
only if $\lambda^{i-1} + \phi'$ is one. 

Proof. If $2\lambda^i + \phi'$ is an index $\psi$-polynomial, $\psi_P$ say, then $(2\lambda^i + \phi') 
- 2\lambda^i + \lambda^{i-1} = \lambda^{i-1} + \phi'$ is the $\psi$-polynomial of a first subordinate of either 
$P$ or of a tree concordant to $P$. Again, if $\lambda^{i-1} + \phi'$ is a $\psi$-polynomial, 
$\psi_Q$ say, then $(\lambda^{i-1} + \phi') - \lambda^{i-1} + 2\lambda^i = 2\lambda^i + \phi'$ is the $\psi$-polynomial of 
a first superior of $Q$. To these somewhat loose remarks we add a formal 
proof. 

Necessity. Use induction on $n(\phi)$, the degree of $\phi$. If $n(\phi) = 1$, 
$\phi = 2\lambda$, i.e. $i = 1$, $\phi' = 0$ and therefore $\lambda^{i-1} + \phi' = 1 = \psi_1$. Assume that the 
condition is necessary for $\psi$-polynomials of degree less than $m$. Let 
$n(\phi) = m$. Then $\phi = 2\lambda^i + \phi' = \lambda\psi_A + \lambda\psi_B$ and either (a) $\psi_A$ or $\psi_B$ contains a 
term $\kappa\lambda^{i-1}$ with $\kappa \ge 2$; or (b) $\psi_A$ and $\psi_B$ each contains a term $\lambda^{i-1}$. 

If (a): suppose that $\psi_B$ contains a term $\kappa\lambda^{i-1}$ and let $\psi_B = 2\lambda^{i-1} + \phi_B$ 
where $\phi_B$ is a polynomial with non-negative coefficients. Then, since 
$n(\psi_B) \le m - 1$, $\lambda^{i-2} + \phi_B$ is a $\psi$-polynomial and 

$\lambda^{i-1} + \phi' = \lambda\psi_A + \lambda\psi_B - 2\lambda^i + \lambda^{i-1}$ 
$= \lambda\psi_A + (2\lambda^i + \lambda\phi_B) - 2\lambda^i + \lambda^{i-1}$ 
$= \lambda\psi_A + \lambda(\lambda^{i-2} + \phi_B)$ 

is also a $\psi$-polynomial. 

If (b): let $\psi_A = \lambda\psi_{A_1} + \lambda\psi_{A_2}$, $\psi_B = \lambda\psi_{B_1} + \lambda\psi_{B_2}$ and suppose that $\psi_{A_1}$ and $\psi_{B_1}$ 
each contains a term $\lambda^{i-2}$. Then $\phi = \lambda(\lambda\psi_{A_1} + \lambda\psi_{B_1}) + \lambda(\lambda\psi_{A_2} + \lambda\psi_{B_2}) = 
\lambda\psi_C + \lambda\psi_D$, say, where $\psi_C$ contains the term $2\lambda^{i-1}$ and the proof proceeds as 
in case (a). 

Sufficiency. Let $\psi_P = \lambda^{i-1} + \phi'$. Then $\phi = \psi_P + (2\lambda - 1)\lambda^{i-1}$. Use 
induction on $n(\psi_P)$. If $n(\psi_P) = 0$, $\psi_P = 1$, $i = 1$ and $\phi = 1 + (2\lambda - 1) = 2\lambda = \psi_2$. 
Suppose that the condition is sufficient for polynomials of degree less than 
$m$ $(m > 0)$ and let $n(\psi_P) = m$. Then $\psi_P = \lambda\psi_Q + \lambda\psi_R$ and either $\psi_Q$ or $\psi_R$ 
contains a term $\kappa\lambda^{i-2}$ $(\kappa > 0)$; let it be $\psi_R$. Hence $\psi_R$ can be written in 
the form $\lambda^{i-2} + \phi_R$, where $\phi_R$ is a polynomial with non-negative coefficients, 
and, since $n(\psi_R) < m$, $2\lambda^{i-1} + \phi_R$ is an index $\psi$-polynomial. But 

$\phi = \psi_P + (2\lambda - 1)\lambda^{i-1}$ 
$= \lambda\psi_Q + \lambda(\lambda^{i-2} + \phi_R) + (2\lambda - 1)\lambda^{i-1}$ 
$= \lambda\psi_Q + \lambda(\phi_R + 2\lambda^{i-1})$ 

and is therefore also a $\psi$-polynomial. 

Corollary 1. A necessary and sufficient condition for $\phi(\lambda)$, a 
polynomial of degree $n$ $(n \ge 1)$ with positive integer coefficients to be an 
index $\psi$-polynomial is that $\phi - (2\lambda - 1)\lambda^{n-1}$ should be one. 

Corollary 2. $\phi(\lambda)$, a polynomial with positive coefficients and 
leading term $2\kappa\lambda^n$, is an index $\psi$-polynomial if and only if $\phi - (2\lambda - 1)\kappa\lambda^{n-1}$ 
is one. 

Necessary and sufficient conditions that $\sum_{i=0}^{n} \kappa_i\lambda^i$ should be an index 
$\theta$-polynomial have been given by Etherington (1951, p. 251). They are: 
(i) all $\kappa_i$ are non-negative integers; (ii) if $i \ne 0$, $\kappa_i \le 2\kappa_{i-1}$; (iii) $\kappa_0 = 0$ or 1. 
The necessity of these is quite obvious. The sufficiency can be proved by 
above Corollary 2. For $\sum_{i=0}^{n} \kappa_i\lambda^i$ is a $\theta$-polynomial if 

$\phi = (2\lambda - 1)\Bigl(\sum_{i=0}^{n} \kappa_i\lambda^i\Bigr) + 1$ 

is a $\psi$-polynomial and this is so if 

$\phi' = \phi - (2\lambda - 1)\kappa_n\lambda^n = (2\lambda - 1)\Bigl(\sum_{i=0}^{n-1} \kappa_i\lambda^i\Bigr) + 1$ 

is one. Now the degree of $\phi'$ is less than that of $\phi$. The proof is by 
induction on degree. 
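
Conditions (i) to (iii) can be compared with a direct enumeration for small altitudes. The sketch below is only an illustration: it assumes the nested-pair encoding of trees (1 for a free end), reads condition (ii) as the non-strict inequality $\kappa_i \le 2\kappa_{i-1}$, and uses hypothetical helper names. It collects the $\theta$-polynomials of all trees of altitude at most 3 and compares them with the coefficient sequences of degree at most 2 that satisfy the three conditions.

```python
from itertools import product

LEAF = 1  # a fork P + Q is the pair (P, Q)


def theta(tree):
    """Coefficient tuple of theta: entry i counts the nodes at altitude i."""
    if tree == LEAF:
        return (0,)
    left, right = tree
    a, b = theta(left), theta(right)
    n = max(len(a), len(b))
    summed = [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]
    coeffs = [1] + summed            # lambda * (theta_P + theta_Q) + 1
    while len(coeffs) > 1 and coeffs[-1] == 0:
        coeffs.pop()
    return tuple(coeffs)


def trees_up_to(altitude):
    """All trees of altitude not greater than `altitude`."""
    trees = [LEAF]
    for _ in range(altitude):
        trees = [LEAF] + [(p, q) for p in trees for q in trees]
    return trees


def satisfies_conditions(coeffs):
    """Conditions (i)-(iii); integrality holds because the inputs are integers."""
    if any(k < 0 for k in coeffs):                       # (i)
        return False
    if coeffs[0] not in (0, 1):                          # (iii)
        return False
    return all(coeffs[i] <= 2 * coeffs[i - 1] for i in range(1, len(coeffs)))  # (ii)


def trimmed(coeffs):
    """Drop trailing zeros so that equal polynomials compare equal."""
    coeffs = list(coeffs)
    while len(coeffs) > 1 and coeffs[-1] == 0:
        coeffs.pop()
    return tuple(coeffs)


from_trees = {theta(t) for t in trees_up_to(3)}
from_conditions = {trimmed(c) for c in product(range(5), repeat=3) if satisfies_conditions(c)}
print(from_trees == from_conditions, len(from_trees))
```

The search range `range(5)` suffices here because no coefficient of degree at most 2 can exceed 4 under condition (ii).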





It is worth noting that if $\sum_{i=0}^{n} \kappa_i\lambda^i$ is an index $\theta$-polynomial then so is 
$\sum_{i=0}^{r} \kappa_i\lambda^i$ $(0 \le r \le n)$. 

THEOREM 8. The free commutative entropic logarithmetic is faithfully 
represented by index $\psi$-polynomials in one indeterminate. 

Proof. I. To prove that concordant indices have the same $\psi$-polynomial, 
i.e. that $(P \approx Q) \Rightarrow \{\psi_P(\lambda) = \psi_Q(\lambda)\}$: 

(i) If $P = Q$ then obviously $\psi_P = \psi_Q$. 

(ii) (1) If $P = (R + S) + (T + U)$ and $Q = (R + T) + (S + U)$ then 

$\psi_P = \psi_Q = \lambda^2(\psi_R + \psi_S + \psi_T + \psi_U)$. 

(ii) (2) If $P = R + S$ and $Q = S + R$ then 

$\psi_P = \psi_Q = \lambda(\psi_R + \psi_S)$. 

(iii) If $P = R + S$, $Q = T + U$ and $R \approx T$, $S \approx U$ then $\psi_P = \lambda(\psi_R + \psi_S)$ 
and $\psi_Q = \lambda(\psi_T + \psi_U)$ which are equal if $\psi_R = \psi_T$ and $\psi_S = \psi_U$. 
Use induction on altitude. 

(iv) If $P = R_1 \approx R_2 \approx \dots \approx R_k = Q$, use induction on $k$. 

II. To prove that $(\psi_P = \psi_Q) \Rightarrow (P \approx Q)$: $\psi_P = \psi_Q$ implies that $\delta_P = \delta_Q$ 
and $\alpha_P = \alpha_Q$. Use induction on $\delta_P$. If $\delta_P = 1$, $P = Q = 1$. Assume that 
the theorem holds for indices of potency less than $d$. Let $\delta_P = d$. Consider 
the trees $P$ and $Q$. Their first principal subordinates have both the 
$\psi$-polynomial $\psi_P - (2\lambda - 1)\lambda^{\alpha_P - 1}$. The potency of these subordinates is 
$d - 1$ so that, by the induction hypothesis, they are concordant. Now, 
$P$ and $Q$ are their superiors satisfying the premises of Theorem 5. The 
result follows. 

COROLLARY. The free commutative entropic logarithmetic is faithfully 
represented by index $\theta$-polynomials in one indeterminate. 

For 

$(\psi_P = \psi_Q) \Leftrightarrow \{(2\lambda - 1)\theta_P + 1 = (2\lambda - 1)\theta_Q + 1\}$ 
$\Leftrightarrow (\theta_P = \theta_Q)$. 

6. Enumeration of Indices 

The numbers $a_\delta$, $p_a$ of possible indices in $\mathfrak{L}$ of given potency $\delta$ and of 
given altitude $a$ respectively are given (v. Etherington 1939) by the 
recurrence formulae 

$a_\delta = a_1 a_{\delta-1} + a_2 a_{\delta-2} + a_3 a_{\delta-3} + \dots,$ 

$p_{a+1} = 2p_a(p_0 + p_1 + p_2 + \dots + p_{a-1}) + p_a^2.$ 

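The formulae for $b_\delta$, $q_a$, the corresponding numbers of possible non-congruent 
indices in $\mathfrak{L}_c$, are (ibid.): 

$b_1 = b_2 = q_0 = 1,$ 

$b_{2\delta-1} = b_1 b_{2\delta-2} + b_2 b_{2\delta-3} + \dots + b_{\delta-1} b_\delta,$ 

$b_{2\delta} = b_1 b_{2\delta-1} + b_2 b_{2\delta-2} + \dots + b_{\delta-1} b_{\delta+1} + \tfrac{1}{2} b_\delta(b_\delta + 1),$ 

$q_{a+1} = q_a(q_0 + q_1 + q_2 + \dots + q_{a-1}) + \tfrac{1}{2} q_a(q_a + 1).$ 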

If we denote the number of indices in $\mathfrak{L}$ of altitude not greater than $a$ by 
$s_a$, i.e. $s_a = \sum_{i=0}^{a} p_i$, the formula for $p_{a+1}$ becomes 

$p_{a+1} = 2p_a s_{a-1} + p_a^2 = (s_{a-1} + p_a)^2 - s_{a-1}^2 = s_a^2 - s_{a-1}^2.$ 

Alternatively, 

$p_{a+1} = p_a(s_a + s_{a-1}).$ 

The first of these formulae gives 

$s_{a+1} = s_a^2 + 1.$ 

Similarly if $t_a = \sum_{i=0}^{a} q_i$ we obtain the corresponding formulae 

$2q_{a+1} = t_a^2 - t_{a-1}^2 + q_a = q_a(t_a + t_{a-1} + 1),$ 

$2^a q_{a+1} = \prod_{i=1}^{a} (t_i + t_{i-1} + 1),$ 

$2t_{a+1} = t_a^2 + t_a + 2.$ 

The enumeration of indices in other free logarithmetics discussed in 
this paper is more difficult. I give below a formula for $r_a$, the number of 
all non-concordant indices of a given altitude. As far as I am aware, the 
other relevant enumeration formulae have not yet been found. 

Let $\kappa$ be a non-negative integer, $\lambda$ an indeterminate and $i$ any non-negative 
integer such that $2^i > \kappa$. Denote by $\Lambda$ the operator defined as 
follows: 

$\Lambda(\kappa\lambda^i) = \kappa\lambda^i$ if $\kappa = 0$ or 1 

and 

$\Lambda(\kappa\lambda^i) = (\kappa - 2)\lambda^i + \lambda^{i-1}$ if $\kappa \ge 2$. 




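Define the $\Lambda$-value of $\kappa$, denoted $\Lambda_\kappa$, as the number of all possible (different) 
polynomials in $\lambda$ obtained by operating with $\Lambda$ in all possible manners on 
$\kappa\lambda^i$ and on terms of thus derived polynomials. 

Example. To find the $\Lambda$-value of 7. We have 

$\Lambda(7\lambda^i) = 5\lambda^i + \lambda^{i-1},$ 

$\Lambda(5\lambda^i) + \lambda^{i-1} = 3\lambda^i + 2\lambda^{i-1},$ 

$\Lambda(3\lambda^i) + 2\lambda^{i-1} = \lambda^i + 3\lambda^{i-1},$ 

$3\lambda^i + \Lambda(2\lambda^{i-1}) = 3\lambda^i + \lambda^{i-2},$ 

$\lambda^i + \Lambda(3\lambda^{i-1}) = \lambda^i + \lambda^{i-1} + \lambda^{i-2},$ 

i.e. 6 distinct polynomials (counting $7\lambda^i$ itself) and it is impossible to obtain more than 6. 
Hence the $\Lambda$-value of 7 is 6. 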
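
The counting in this example can be mechanised. The following is a minimal sketch, not the author's procedure: it assumes a polynomial $c_i\lambda^i + c_{i-1}\lambda^{i-1} + \dots + c_0$ is stored as the tuple $(c_i, \dots, c_0)$, takes the smallest $i$ with $2^i > \kappa$, and explores every polynomial reachable by repeated application of $\Lambda$ to any term with coefficient at least 2. The name `lambda_value` is hypothetical.

```python
def lambda_value(kappa):
    """Count the distinct polynomials derivable from kappa * lambda^i by the operator."""
    i = 0
    while 2 ** i <= kappa:          # smallest i with 2^i > kappa
        i += 1
    start = (kappa,) + (0,) * i     # coefficients of lambda^i, lambda^(i-1), ..., lambda^0
    seen = {start}
    stack = [start]
    while stack:
        poly = stack.pop()
        for pos in range(i):        # positions 0..i-1 hold the degrees i..1
            if poly[pos] >= 2:
                derived = list(poly)
                derived[pos] -= 2   # (kappa - 2) * lambda^j ...
                derived[pos + 1] += 1   # ... + lambda^(j-1)
                derived = tuple(derived)
                if derived not in seen:
                    seen.add(derived)
                    stack.append(derived)
    return len(seen)


print(lambda_value(7))
print([lambda_value(k) for k in range(16)])
```

With this encoding the search for $\kappa = 7$ visits exactly the six polynomials listed in the example, the starting polynomial $7\lambda^i$ included.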


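It is easily seen that 

$\Lambda_0 = 1$, $\Lambda_1 = 1$, $\Lambda_2 = 2$, $\Lambda_3 = 2$, 

$\Lambda_4 = 4$, $\Lambda_5 = 4$, $\Lambda_6 = 6$, $\Lambda_7 = 6$, 

$\Lambda_8 = 10$, $\Lambda_9 = 10$, $\Lambda_{10} = 14$, $\Lambda_{11} = 14$, 

$\Lambda_{12} = 20$, $\Lambda_{13} = 20$, $\Lambda_{14} = 26$, $\Lambda_{15} = 26$, etc. 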


In fact we have 

Lemma. 

$\Lambda_{2\kappa+1} = \Lambda_{2\kappa} = \sum_{r=0}^{\kappa} \Lambda_r.$ 

Proof. Use induction on $\kappa$. The formula gives correct $\Lambda$-value for 
$\kappa = 1$. Assume that the formula holds for integers less than $\kappa$. Consider 
$\Lambda_{2\kappa}$, the number of all distinct polynomials which can be obtained from 
$2\kappa\lambda^i$ by the process described above. All such polynomials with a term 
in $\lambda^i$ are obtained from $2\kappa\lambda^i = 2\lambda^i + 2(\kappa - 1)\lambda^i$ by operating with $\Lambda$ in all 
possible manners on the term $2(\kappa - 1)\lambda^i$ and on terms derived from it. 
There are $\Lambda_{2(\kappa-1)}$ such polynomials. All the derived polynomials of degree 
less than $i$ are obtained by operating in the same way on $\kappa\lambda^{i-1}$. There 
are $\Lambda_\kappa$ of these. Hence $\Lambda_{2\kappa} = \Lambda_{2(\kappa-1)} + \Lambda_\kappa$. But, by the induction hypothesis, 
$\Lambda_{2(\kappa-1)} = \sum_{r=0}^{\kappa-1} \Lambda_r$. Thus $\Lambda_{2\kappa} = \sum_{r=0}^{\kappa-1} \Lambda_r + \Lambda_\kappa = \sum_{r=0}^{\kappa} \Lambda_r$. $\Lambda_{2\kappa+1}$ is the number 
of possible polynomials obtained from $(2\kappa + 1)\lambda^i$ by the same process. 
Now, $(2\kappa + 1)\lambda^i = \lambda^i + 2\kappa\lambda^i$ and since $\Lambda(\lambda^i) = \lambda^i$ all the required polynomials 
are obtained by operating with $\Lambda$ on the term $2\kappa\lambda^i$ and on terms derived 
from it. Thus $\Lambda_{2\kappa+1} = \Lambda_{2\kappa}$. 


Denote by $r_a$ the number of all index $\psi$-polynomials, $\psi(\lambda)$, of degree $a$,
i.e. the number of all non-concordant indices of altitude $a$.

Theorem 9.

$$r_{a+1} = \sum_{i=0}^{2^a-1} \Lambda_i, \quad \text{i.e.} \quad r_a = \Lambda_{2^a-2}.$$

Proof. All possible trees of altitude $a+1$ are subordinates of the
plenary tree $2^{a+1}$. Moreover, Theorem 7 implies that if we operate with
$\Lambda$ on a term of a $\psi$-polynomial $\psi_P$ and the resulting polynomial $\phi$ differs
from $\psi_P$, then $\phi$ is the $\psi$-polynomial of a first subordinate of $P$. Thus all
index $\psi$-polynomials of degree $a+1$ can be obtained by operating with $\Lambda$
on $2^{a+1}\lambda^{a+1}$, the index $\psi$-polynomial of the plenary tree $2^{a+1}$, and on terms
of the derived polynomials in such a way as to leave in each resulting
polynomial a term in $\lambda^{a+1}$. We can obtain all these polynomials in the
following way: first operate with $\Lambda$ on the leading terms only and obtain
the sequence of $\psi$-polynomials of the first, second, . . . , $(2^a - 1)$th principal
subordinates of $2^{a+1}$:

$$2^{a+1}\lambda^{a+1},\ (2^{a+1}-2)\lambda^{a+1} + \lambda^a,\ (2^{a+1}-4)\lambda^{a+1} + 2\lambda^a,\ \ldots,$$

$$(2^{a+1}-2i)\lambda^{a+1} + i\lambda^a,\ \ldots,\ 4\lambda^{a+1} + (2^a-2)\lambda^a,\ 2\lambda^{a+1} + (2^a-1)\lambda^a.$$

Now, from each $(2^{a+1} - 2i)\lambda^{a+1} + i\lambda^a$ we can obtain all $\psi$-polynomials of
degree $a+1$ with leading term $(2^{a+1} - 2i)\lambda^{a+1}$ by leaving the term in $\lambda^{a+1}$
alone and operating with $\Lambda$ on $i\lambda^a$ and on other resulting terms. But, by
the definition of $\Lambda$-value, we can obtain in this manner exactly $\Lambda_i$
polynomials. Hence $r_{a+1} = \sum_{i=0}^{2^a-1} \Lambda_i$. Now, by the Lemma,
$\sum_{i=0}^{2^{a-1}-1} \Lambda_i = \Lambda_{2^a-2}$ and so $r_a = \Lambda_{2^a-2}$.

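For

$$\begin{array}{lccccccccc}
a = & 1, & 2, & 3, & 4, & 5, & 6, & 7, & \ldots \\
r_a = & 1, & 2, & 6, & 26, & 166, & 1626, & 25510, & \ldots.
\end{array}$$

Let

$$f(x) = \Lambda_0 + \Lambda_1 x + \Lambda_2 x^2 + \cdots.$$

Then

$$\begin{aligned}
\frac{f(x)}{1-x} &= (\Lambda_0 + \Lambda_1 x + \Lambda_2 x^2 + \cdots)(1 + x + x^2 + \cdots) \\
&= \Lambda_0 + (\Lambda_0 + \Lambda_1)x + (\Lambda_0 + \Lambda_1 + \Lambda_2)x^2 + \cdots \\
&= \Lambda_0 + \Lambda_2 x + \Lambda_4 x^2 + \cdots = \Lambda_1 + \Lambda_3 x + \Lambda_5 x^2 + \cdots.
\end{aligned}$$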

The Free Commutative Entropic Logarithmetic 




Hence

$$f(x) = \frac{f(x^2)}{1-x^2} + \frac{x f(x^2)}{1-x^2}, \qquad f(x) = \frac{f(x^2)}{1-x}.$$


This functional equation is easily solved by iteration:

$$f(x) = \frac{1}{(1-x)(1-x^2)(1-x^4)\cdots} = \prod_{i=0}^{\infty} (1 - x^{2^i})^{-1}$$

$$= (1 + x + x^2 + x^3 + x^4 + \cdots)(1 + x^2 + x^4 + \cdots)(1 + x^4 + x^8 + \cdots)(1 + x^8 + \cdots)\cdots.$$

Thus $r_a$ $(a > 0)$ is the coefficient of $x^{2^a-2}$ in the Maclaurin expansion of
this function. Alternatively, $r_a$ is the coefficient of $x^{2^a}$ in $1 + x f(x)$
(all $a$).
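The product form makes the coefficients easy to compute: each factor $(1 - x^{2^i})^{-1}$ contributes one "part size" $2^i$, so the coefficient of $x^n$ counts the ways of writing $n$ as a sum of powers of 2. The sketch below (illustrative names, standard coin-counting recurrence, not taken from the paper) expands the product and reads off $r_a$ as the coefficient of $x^{2^a-2}$.

```python
def product_coefficients(n_max):
    """Coefficients of prod_i (1 - x^(2^i))^(-1) up to x^n_max."""
    coeffs = [1] + [0] * n_max
    part = 1
    while part <= n_max:
        # Multiplying by 1/(1 - x^part) is a running sum with stride `part`.
        for n in range(part, n_max + 1):
            coeffs[n] += coeffs[n - part]
        part *= 2
    return coeffs


def r_sequence(a_max):
    """r_1, ..., r_{a_max}, with r_a the coefficient of x^(2^a - 2)."""
    coeffs = product_coefficients(2 ** a_max - 2)
    return [coeffs[2 ** a - 2] for a in range(1, a_max + 1)]
```

With `a_max = 7` this gives the seven values of $r_a$ listed above, and summing the first $2^a$ coefficients gives $r_{a+1}$, as Theorem 9 states.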

(The preceding paragraph on the generating function $f(x)$ was
communicated to me by Dr Etherington.)

Finally we prove a formula for $n_P$, the number of trees (or indices)
concordant to a given tree $P$. This formula is derived from a
similar formula (for the number of indices having the same given
palindromic index $\psi$-polynomial) communicated to me as a conjecture by
Dr Etherington.

THEOREM 10. If $P$ is a given tree and

$$\psi_P = \sum_i \tau_i \lambda^i, \qquad \theta_P = \sum_i \pi_i \lambda^i,$$

then

$$n_P = \prod_i \binom{\tau_i + \pi_i}{\pi_i}.$$

Proof. Use induction on the altitude of $P$. The formula obviously
holds for altitude 1. Assume that it holds for trees of altitudes less than $a$.
If $P$ is of altitude $a$ then $\psi_Q = \psi_P - (2\lambda - 1)\tfrac{1}{2}\tau_a\lambda^{a-1}$ is the $\psi$-polynomial of
the $(\tfrac{1}{2}\tau_a)$th principal subordinate $Q$ of $P$, a tree of altitude $a - 1$. Every tree
whose $\psi$-polynomial is equal to $\psi_Q$ has $\tau_{a-1} + \tfrac{1}{2}\tau_a$ free ends at its maximal
altitude $a - 1$. To obtain all trees concordant to $P$ we join the nodes of
$\tfrac{1}{2}\tau_a$ forks in all possible manners to these free ends. This can be done in

$$\binom{\tau_{a-1} + \tfrac{1}{2}\tau_a}{\tfrac{1}{2}\tau_a}$$

distinct ways for each tree whose $\psi$-polynomial is equal to $\psi_Q$.
Hence $n_P = \binom{\tau_{a-1} + \frac{1}{2}\tau_a}{\frac{1}{2}\tau_a} n_Q$. But, since $\psi_P = (2\lambda - 1)\theta_P + 1$, $\tfrac{1}{2}\tau_a = \pi_{a-1}$ and
therefore $n_P = \binom{\tau_{a-1} + \pi_{a-1}}{\pi_{a-1}} n_Q$. Now, by the induction hypothesis,
$n_Q = \prod_{i=1}^{a-2} \binom{\tau_i + \pi_i}{\pi_i}$. The result follows.

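Example. To find the number of trees concordant to 3.3.4.

$$\psi_{3.3.4} = 8\lambda^7 + 12\lambda^6 + 10\lambda^5 + 5\lambda^4 + \lambda^3,$$

$$\theta_{3.3.4} = 4\lambda^6 + 8\lambda^5 + 9\lambda^4 + 7\lambda^3 + 4\lambda^2 + 2\lambda + 1.$$

$$\begin{array}{lcccc}
\lambda^i: & \lambda^6, & \lambda^5, & \lambda^4, & \lambda^3 \\
\tau_i: & 12, & 10, & 5, & 1 \\
\pi_i: & 4, & 8, & 9, & 7.
\end{array}$$

For all other terms either $\tau_i$ or $\pi_i$ is 0 and thus $\binom{\tau_i + \pi_i}{\pi_i} = 1$.

$$n_{3.3.4} = \binom{16}{4}\binom{18}{8}\binom{14}{9}\binom{8}{7} = 1\,275\,507\,192\,960.$$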
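The arithmetic of this example can be reproduced directly from Theorem 10. The sketch below is illustrative only: the dictionaries map an exponent $i$ to the coefficient $\tau_i$ or $\pi_i$ read off from the two polynomials above, and `concordant_count` is a hypothetical helper name.

```python
from math import comb

# Coefficients of psi_{3.3.4} and theta_{3.3.4}, keyed by the exponent of lambda.
tau = {7: 8, 6: 12, 5: 10, 4: 5, 3: 1}
pi = {6: 4, 5: 8, 4: 9, 3: 7, 2: 4, 1: 2, 0: 1}


def concordant_count(tau, pi):
    """Product over i of C(tau_i + pi_i, pi_i); missing coefficients are 0."""
    total = 1
    for i in set(tau) | set(pi):
        t, p = tau.get(i, 0), pi.get(i, 0)
        total *= comb(t + p, p)   # equals 1 whenever t == 0 or p == 0
    return total


n_334 = concordant_count(tau, pi)
```

Only the exponents 6, 5, 4 and 3 contribute factors different from 1, which is why the example lists just those four columns.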


References to Literature

Birkhoff, G., 1948. Lattice Theory. New York.

Etherington, I. M. H., 1939. "On non-associative combinations", Proc. Roy.
Soc. Edin., 59, 153-162.

Etherington, I. M. H., 1940, 1945. "Commutative train algebras of ranks 2 and 3",
J. Lond. Math. Soc., 15, 136-149; 20, 238.

Etherington, I. M. H., 1949. "Non-associative arithmetics", Proc. Roy. Soc. Edin., A, 62, 442-453.

Etherington, I. M. H., 1951. "Non-commutative train algebras of ranks 2 and 3",
Proc. Lond. Math. Soc., 52, 241-252.

Minc, H., 1957. "Index polynomials and bifurcating root-trees", Proc. Roy.
Soc. Edin., A, 64, 319-341.

Murdoch, D. C., 1939. "Quasi-groups which satisfy certain generalized
associative laws", Amer. J. Math., 61, 509-522.


(Issued separately October 26, 1959) 





XII. Solution of the Equation $ze^z = a$.* By E. M. Wright,

University of Aberdeen. (With One Text-figure.) 

(MS. received January 9, 1959. Read March 3, 1959) 


Synopsis 

The roots of the equation $ze^z = a$ are of importance in several theories. Various authors
have studied certain of their properties over more than a century. Here we solve the
equation, in the sense that we define the sequence $\{Z_n\}$ of roots and, except for a small,
finite number of values of $n$, find a rapidly convergent series for $Z_n$. The terms in this
series are alternately real and purely imaginary and so the series is very convenient for
calculation. For the few remaining roots, we give practicable methods of numerical
calculation and supply an auxiliary table.

The main results of this article have been announced without proof or details in 
Wright 1959. 


Introduction 

1. The equation

$$ze^z = a \qquad (a \ne 0), \tag{1.1}$$

or some trivial transform of it, plays a part in the iteration of the exponential
function (Euler 1927, Eisenstein 1844, Wright 1947) and in the theory
and various applications of certain difference-differential equations
(Polossuchin 1910, Schürer 1912, 1913, Bellman 1949, Wright 1955).
For this reason, several authors (Hayes 1950, 1952, Lémeray 1896, 1897,
Polossuchin 1910, Wright 1955) have studied properties of its roots.
In particular, the least upper bound of the real parts of the roots is of
importance in stability theory and has received special attention (Hayes
1950, Wright 1955).

The equation (1.1) is equivalent to

$$z + \log z = \log a, \tag{1.2}$$

where we consider $\log a$ many-valued and fix the value of $\log z$ by cutting
the $z$-plane along the real axis from $-\infty$ to 0. In the interior of the
cut-plane we suppose that

$$-\pi < \arg z = \Im(\log z) < \pi.$$

We write $z = x + iy$ and $w = u + iv$, where

$$w = w(z) = z + \log z$$

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 





and $x$, $y$, $u$, $v$ are real. We have

$$\frac{dw}{dz} = \frac{z+1}{z}$$

and this vanishes at $z = -1$. The upper edge of the cut in the $z$-plane
corresponds to a cut in the $w$-plane along the semi-infinite straight line on
which

$$u \le -1, \qquad v = \pi, \tag{1.3}$$

the part on which $x < -1$ corresponding to the upper edge of the cut (1.3)
and the part on which $-1 < x < 0$ to the lower edge. Similarly the lower
edge of the cut in the $z$-plane corresponds to both edges of a cut in the
$w$-plane on which

$$u \le -1, \qquad v = -\pi. \tag{1.4}$$

The points at which $w = -1 \pm \pi i$ are branch points in the $w$-plane.

We take $K$, $d$ two positive numbers, of which $K$ is large and $d$ small.
$\mathcal{C}$ is the closed contour in the cut $z$-plane consisting of

(i) the circle $|z| = K$ described counter-clockwise,

(ii) the upper edge of the cut from $z = -K$ to $z = -d$, indented above
$z = -1$ by a semicircle of radius $d$,

(iii) the circle $|z| = d$, described clockwise, and

(iv) the lower edge of the cut from $-d$ to $-K$, indented as in
(ii), but below $z = -1$.

The transform of $\mathcal{C}$ in the $w$-plane is shown in the figure, in which
corresponding points in the two planes have the same letters.

Now let $w_0$ be any fixed value of $w$ not on either of the cuts (1.3) and
(1.4). A little consideration shows that, in the $w$-plane, as $K \to \infty$, the
curve ABC recedes to infinity and, as $d \to 0$, the curve EFG recedes to
infinity and the loops at D and H shrink inwards onto the ends of the cuts.
Hence, by taking $K$ large enough and $d$ small enough, we can ensure that
the transform of $\mathcal{C}$ encloses $w_0$ in its interior. Then, as $z$ goes once
counter-clockwise round $\mathcal{C}$, $w$ goes once counter-clockwise round the
transform of $\mathcal{C}$ and $\arg(w - w_0)$ increases by $2\pi$. Hence there is just one
$z_0$ within $\mathcal{C}$ such that $w(z_0) = w_0$.
Thus in the interior of the cut $w$-plane, the inverse function $z = z(w)$ is
uniquely defined.

The function $z = z(w)$ can be continued analytically across either of the
cuts in the $w$-plane onto another sheet of the corresponding Riemann
surface. For example, if we cross the cut (1.3) downwards, we pass into a
sheet in which the branch points are at $-1 + \pi i$ and $-1 + 3\pi i$. (If we
encircle the first of these counter-clockwise and again cross the cut (1.3)
downwards, we are back in the original sheet.) In every sheet of the
Riemann surface the branch points lie on the straight line $u = -1$.

Now suppose that $w_0$ is any point in the interior of the original cut
$w$-plane. Let us draw a circle with centre $w_0$ and circumference passing
through the nearer of the two branch points $-1 \pm \pi i$, so that its radius is
$\min |-1 \pm \pi i - w_0|$. This circle may have part of its interior in another
sheet of the Riemann surface, but it cannot have any branch point in its
interior. If $w_1$ is any point within the circle, i.e. if

$$|w_1 - w_0| < \min |-1 \pm \pi i - w_0|, \tag{1.5}$$

we have the convergent Taylor expansion

$$z(w_1) = z(w_0) + \sum_{k=1}^{\infty} Q_k (w_1 - w_0)^k, \tag{1.6}$$

where $Q_k$ is the value of $\dfrac{1}{k!}\dfrac{d^k z}{dw^k}$ at $w = w_0$. The value of $z(w_1)$ given by
this formula is the unique value valid in the cut $w$-plane, provided the
straight line joining $w_0$ and $w_1$ does not cross one of the cuts.

If $w_0$ or $w_1$ or both lie on one or other of the cuts (but not at a branch
point), (1.6) is still true, provided we take the appropriate values of $z(w_1)$
and $z(w_0)$. If, for example, $w_0$ and $w_1$ are both on the same cut, we must
suppose them on the same edge of the cut and so take $z(w_0)$ and $z(w_1)$ both
less than $-1$ or both between $-1$ and 0.


Enumeration of the Roots

2. If $a = Ae^{i\alpha}$, where $-\pi < \alpha \le \pi$, $A > 0$, we have

$$\log a = \log A + i(\alpha + 2n\pi)$$

in (1.2), where $n$ is any integer. Taking this as $w$ in the discussion of the
last section, we see that

$$Z_n = z\{\log A + i(\alpha + 2n\pi)\} \tag{2.1}$$

is a root of (1.1) and that, apart from an exceptional case, all the roots of
(1.1) are given by (2.1), when $n$ runs through all integers, positive, zero or
negative.

The exceptional case arises when $w = \log a$ lies on one or other of the
two cuts (1.3) and (1.4) in the $w$-plane. In this case $\alpha = \pi$, $n = 0$ or $-1$
and $\log A \le -1$, i.e. $a$ is real and $-e^{-1} \le a < 0$. There are then two real
negative values of $z$ corresponding to the value $\log A + i\pi$ of $w$ and the
same two values of $z$ corresponding to the value $\log A - i\pi$. We take one
of these values as $Z_{-1}$ and the other as $Z_0$. We have $Z_{-1} \ne Z_0$, unless
$a = -e^{-1}$, when $Z_{-1} = Z_0 = -1$. This is the double root of (1.1) when
$a = -e^{-1}$.

Since the positive half of the real axis in the $z$-plane corresponds to the
whole real axis in the $w$-plane, $Z_0$ is real when $\alpha = 0$, i.e. when $a$ is real and
positive.

Apart from these, there are no other real roots. For all the non-real
roots, we put

$$Z_n = X_n + iY_n = R_n e^{i\theta_n} \qquad (0 < |\theta_n| < \pi) \tag{2.2}$$

and have

$$Y_n + \theta_n = \alpha + 2n\pi \tag{2.3}$$

by equating imaginary parts of (1.2). Clearly $Y_n$ and $\theta_n$ have the same
sign. Thus $Y_0$ lies between 0 and $\alpha$,

$$(2n - 1)\pi + \alpha < Y_n < 2n\pi + \alpha \qquad (n \ge 1)$$

and

$$2n\pi + \alpha < Y_n < (2n + 1)\pi + \alpha \qquad (n \le -1).$$

These preliminary results are not new. For real $a$, they are well known
and essentially due to Lémeray (1896, 1897). For complex $a$, they are
given by Hayes (1950) in a different notation. These authors use (real
variable) methods different from mine.


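Three Expansions for $z(w_1) - z(w_0)$

3. In (1.6), let us write $z_0 = z(w_0)$, $\gamma = w_1 - w_0$ and $z_1 = z(w_1)$.
We have

$$\frac{dz}{dw} = \frac{z}{1+z}, \qquad \frac{d}{dw} = \frac{z}{1+z}\frac{d}{dz}$$

and so

$$Q_1 = \frac{z_0}{1+z_0}, \qquad Q_k = \frac{1}{k}\,\frac{z_0}{1+z_0}\,\frac{dQ_{k-1}}{dz_0} \qquad (k \ge 2). \tag{3.1}$$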


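It follows from (3.1) that

$$Q_k = \sum_{h=k}^{2k-1} (-1)^h b_{kh} (1+z_0)^{-h} \qquad (k \ge 2), \tag{3.2}$$

where all the $b_{kh}$ are positive numbers depending only on $k$ and $h$. Now
let us suppose that $|z_0| > 1$. We have

$$Q_k = \sum_{h=k}^{2k-1} \frac{(-1)^h b_{kh}}{(z_0+1)^h} = \sum_{m=k}^{\infty} \frac{(-1)^m a_{mk}}{z_0^m} \qquad (k \ge 2)$$

(say), where $a_{mk}$ is a positive number depending only on $m$ and $k$. Again

$$Q_1 = \frac{z_0}{z_0+1} = 1 + \sum_{m=1}^{\infty} \frac{(-1)^m}{z_0^m},$$

so that $a_{m1} = 1$. Now, by (3.1),

$$kQ_k = \frac{dQ_{k-1}}{dw_0} = \frac{z_0}{z_0+1}\,\frac{dQ_{k-1}}{dz_0}$$

and so

$$k(1+z_0)\sum_{m=k}^{\infty} (-1)^m a_{mk} z_0^{-m} = \sum_{m=k-1}^{\infty} (-1)^{m+1} m\, a_{m,k-1} z_0^{-m}.$$

From this we deduce that

$$k a_{kk} = (k-1) a_{k-1,k-1} \tag{3.3}$$

and that

$$k a_{m+1,k} = k a_{mk} + m a_{m,k-1} \qquad (m \ge k). \tag{3.4}$$

Combined with the fact that $a_{m1} = 1$, these enable us to calculate the $a_{mk}$
in succession. We can also establish by induction on $k$ and $m$ that

$$a_{mk} \le \frac{2^k (m+k-1)!}{(m-k)!\,(2k-1)!},$$

where, as usual, $0!$ denotes 1.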

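We have now by (1.6)

$$z(w_1) = z_0 + \sum_{k=1}^{\infty} Q_k \gamma^k = z_0 + \gamma + \sum_{k=1}^{\infty}\sum_{m=k}^{\infty} (-1)^m a_{mk} z_0^{-m} \gamma^k. \tag{3.5}$$

The double series is absolutely convergent provided that

$$\sum_{k=1}^{\infty} |2\gamma|^k \sum_{m=k}^{\infty} \frac{(m+k-1)!}{(m-k)!\,(2k-1)!}\, |z_0|^{-m} \tag{3.6}$$

is convergent. Since $|z_0| > 1$, we have

$$\frac{|z_0|^k}{(|z_0|-1)^{2k}} = \frac{|z_0|^{-k}}{(1-|z_0|^{-1})^{2k}} = \sum_{m=k}^{\infty} \frac{(m+k-1)!}{(m-k)!\,(2k-1)!}\, |z_0|^{-m}$$

and so the double series (3.6) is convergent, provided that both of

$$|z_0| > 1, \qquad |2\gamma z_0| < (|z_0| - 1)^2 \tag{3.7}$$

are satisfied. When these are true, we can change the order of summation
in (3.5) and find that

$$z(w_1) = z_0 + \gamma + \sum_{m=1}^{\infty} (-1)^m z_0^{-m} P_m(\gamma), \tag{3.8}$$

where

$$P_m(\gamma) = \sum_{k=1}^{m} a_{mk} \gamma^k$$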

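is a polynomial in $\gamma$ with positive coefficients. We have $P_1(\gamma) = \gamma$ and,
by (3.3) and (3.4),

$$P_m(\gamma) = P_{m-1}(\gamma) + (m-1)\int_0^{\gamma} P_{m-1}(t)\,dt.$$

Hence, in particular,

$$P_2 = \gamma + \tfrac{1}{2}\gamma^2, \qquad P_3 = \gamma + \tfrac{3}{2}\gamma^2 + \tfrac{1}{3}\gamma^3, \qquad P_4 = \gamma + 3\gamma^2 + \tfrac{11}{6}\gamma^3 + \tfrac{1}{4}\gamma^4,$$

$$P_5 = \gamma + 5\gamma^2 + \tfrac{35}{6}\gamma^3 + \tfrac{25}{12}\gamma^4 + \tfrac{1}{5}\gamma^5.$$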
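The integral recurrence for $P_m$ is easy to mechanise with exact rational arithmetic. The following sketch is illustrative (the name `p_polynomials` and the list-of-coefficients representation are choices made here, not the paper's); it stores $P_m$ as the list of coefficients $a_{mk}$, with index $k$ running from 0.

```python
from fractions import Fraction


def p_polynomials(m_max):
    """Return {m: [a_m0, a_m1, ..., a_mm]} for 1 <= m <= m_max.

    Uses P_1(g) = g and P_m(g) = P_{m-1}(g) + (m - 1) * integral_0^g P_{m-1}.
    """
    polys = {1: [Fraction(0), Fraction(1)]}
    for m in range(2, m_max + 1):
        prev = polys[m - 1]
        cur = prev + [Fraction(0)]            # copy of P_{m-1}, one degree higher
        for k, coeff in enumerate(prev):
            # The integral of coeff * g^k is coeff * g^(k+1) / (k + 1).
            cur[k + 1] += (m - 1) * coeff / (k + 1)
        polys[m] = cur
    return polys
```

Calling `p_polynomials(5)[5]` returns the coefficients $0, 1, 5, \tfrac{35}{6}, \tfrac{25}{12}, \tfrac{1}{5}$ of $P_5$ given above.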

We can deduce from (3.3) and (3.4) and the fact that a ml = I that, for 
any £, 




so that the left-hand side of (3.9) is a generating function of $k!\,a_{mk}$.

Again, provided that 2|#y|(i+|/|)<i, we can deduce from (1.6) 
that 

00 

*W=%+ 2 * A x*(y)> ( 3 -*°) 

7i=0 

where 

Xo=xi =y, x»(y)= w ^ <h d khY k {h > 2) 

and 

X*'(y)-(A - i)xA-i(y) +(A - 2)Xh-i(y) (h > 2). 

But, as I have found no use for (3.9) or (3.10) in my present problem, I omit
the proofs, which are in any case sufficiently obvious.



Calculation of the Distant Roots

4. For every positive integer $n$ we write

$$H = 2n\pi + \alpha - \tfrac{1}{2}\pi, \qquad \beta = \log (A/H).$$

If we put $z_0 = iH$ and $z_1 = Z_n$, we have

$$\gamma = w(Z_n) - w(iH) = (2n\pi + \alpha)i + \log A - iH - \tfrac{1}{2}\pi i - \log H = \beta,$$

which is real. If we use these values in the expansion (3.8), we have

$$Z_n = iH + \beta + \sum_{m=1}^{\infty} (-1)^m (iH)^{-m} P_m(\beta) \tag{4.1}$$

provided that (3.7) and (1.5) are satisfied, that is

$$H > 1, \qquad 2H|\beta| < (H-1)^2 \tag{4.2}$$

and

$$|\beta| < |w(iH) + 1 - \pi i|,$$

which is

$$(\log A)^2 < (H - \tfrac{1}{2}\pi)^2 + 2(1 + \log A)\log H + 1. \tag{4.3}$$

Both conditions are obviously satisfied for large enough $n$.

Since $Z_n = X_n + iY_n$, we have from (4.1)

$$Y_n = H + \eta, \tag{4.4}$$

where

$$\eta = \sum_{j=1}^{\infty} (-1)^{j-1} H^{-(2j-1)} P_{2j-1}(\beta) \tag{4.5}$$

and

$$X_n = \beta + \sum_{j=1}^{\infty} (-1)^j H^{-2j} P_{2j}(\beta). \tag{4.6}$$

If we wish to calculate $X_n$ but not $Y_n$, as is often the case in applications,
we use (4.6). If, however, we wish to calculate both $X_n$ and $Y_n$ we can
shorten our work by proceeding as follows. By (2.2) and (2.3), we have

$$X_n = Y_n \cot\theta_n = Y_n \cot(\alpha - Y_n) = (H + \eta)\tan\eta. \tag{4.7}$$

We now calculate $\eta$ from (4.5) and have $Y_n$ and $X_n$ from (4.4) and (4.7).
Since $\eta = O(\beta/H)$, tables of $\tan\eta$ may not enable us to calculate $X_n$ from
(4.7) with the accuracy we require for large $H$. If so, we use

$$X_n = (H + \eta)\tan\eta = \eta(H + \eta)\left(1 + \frac{\eta^2}{3} + \frac{2\eta^4}{15} + \frac{17\eta^6}{315} + O(\eta^8)\right). \tag{4.8}$$

For negative $n$, we write

$$H = -2n\pi - \tfrac{1}{2}\pi - \alpha > 0, \qquad \beta = \log (A/H), \qquad z_0 = -iH.$$

If (4.2) and (4.3) are satisfied, we have

$$Y_n = -H - \eta, \tag{4.9}$$

where $\eta$ is defined in terms of $H$ and $\beta$ by (4.5). The formulae (4.6), (4.7)
and (4.8) are still true. Again, we may use (4.6) to calculate $X_n$ or
(4.5), (4.9) and (4.8) to calculate both $Y_n$ and $X_n$.

If we are to calculate $\eta$ from a reasonable number of terms of (4.5),
it is essential that $\beta/H$ be fairly small. The conditions (4.2) and (4.3)
are then certainly satisfied.
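The series (4.1) can be tried out numerically. The sketch below is only an illustration for positive $n$: it builds the coefficients of $P_m$ from the integral recurrence of § 3, sums a fixed number of terms of (4.1) in complex floating point, and checks condition (4.2) only (condition (4.3) is assumed to hold). The function names and the truncation at 25 terms are choices made here, not the paper's.

```python
import cmath
import math
from fractions import Fraction


def p_coefficients(m_max):
    """Coefficient lists of P_1, ..., P_{m_max} (index k = power of the argument)."""
    rows = {1: [Fraction(0), Fraction(1)]}
    for m in range(2, m_max + 1):
        prev = rows[m - 1]
        cur = prev + [Fraction(0)]
        for k, coeff in enumerate(prev):
            cur[k + 1] += (m - 1) * coeff / (k + 1)
        rows[m] = cur
    return rows


def distant_root(a, n, terms=25):
    """Approximate Z_n (n >= 1) from the first `terms` terms of (4.1)."""
    A, alpha = abs(a), cmath.phase(a)          # a = A * exp(i * alpha)
    H = 2 * n * math.pi + alpha - math.pi / 2
    beta = math.log(A / H)
    if not (H > 1 and 2 * H * abs(beta) < (H - 1) ** 2):
        raise ValueError("condition (4.2) fails; n is not large enough")
    rows = p_coefficients(terms)
    z0 = 1j * H
    total = z0 + beta
    for m in range(1, terms + 1):
        p_m = sum(float(c) * beta ** k for k, c in enumerate(rows[m]))
        total += (-1) ** m * z0 ** (-m) * p_m
    return total
```

For instance, `distant_root(1.0, 5)` gives an approximation $Z$ to the root with $n = 5$ for $a = 1$; substituting it back into $Z e^{Z}$ should reproduce $a$ to within rounding error, and its imaginary part lies between $9\pi$ and $10\pi$ as § 2 requires.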


Two Expansions for $z(w)$ Valid in Particular Regions

5. For small values of $|a|$, it is well known (Hurwitz and Courant
1929) that one solution $z$ of (1.1) has the expansion

$$z = \sum_{m=1}^{\infty} \frac{(-1)^{m-1} m^{m-1} a^m}{m!}.$$

This series converges for $|a| < e^{-1}$. It is easy to deduce that, when $u < -1$
and $-\pi < v < \pi$, i.e. when $w$ lies between the two cuts in the $w$-plane, our
inverse function $z(w)$ has the expansion

$$z(w) = \sum_{m=1}^{\infty} \frac{(-1)^{m-1} m^{m-1} e^{mw}}{m!}. \tag{5.1}$$

Near the branch point $w_0 = -1 + \pi i$, we write

$$z = -1 + \zeta, \qquad w = w_0 - \tfrac{1}{2}\omega^2,$$

so that

$$\zeta + \log (1 - \zeta) = -\tfrac{1}{2}\omega^2, \tag{5.2}$$

where the value of the logarithm is small when $\zeta$ is small. From (5.2),
we have

$$\omega^2 = \zeta^2\left(1 + \tfrac{2}{3}\zeta + \tfrac{1}{2}\zeta^2 + \tfrac{2}{5}\zeta^3 + \cdots\right)$$

and we may invert this series and deduce that

$$z + 1 = \zeta = \sum_{m=1}^{\infty} c_m \omega^m, \tag{5.3}$$

where $c_1 = \pm 1$. Detailed consideration of the $z$, $w$ mapping in the
neighbourhood of $w_0$ shows us that we may take $c_1 = 1$, provided that we take
that value of

$$\omega = \sqrt{2w_0 - 2w}$$

which has a positive imaginary part when $w$ does not lie on the cut ending
at $w_0$. When $w$ lies on the top edge of the cut, $\omega$ is real and negative;
when $w$ lies on the bottom edge, $\omega$ is real and positive. Again, if we
encircle $w_0$ twice counter-clockwise we pass from our cut plane into a
sheet of the Riemann surface in which the branch points are at $w_0$ and
$w_0 + 2\pi i$, and then back into our original cut-plane. The singularities of
$\zeta = \zeta(\omega)$ with smallest modulus therefore lie at

$$\omega = (2\pi)^{1/2}(1 \pm i)$$

and so the radius of convergence of (5.3) is $2\sqrt{\pi}$. Thus the expansion
(5.3) is valid for $|w - w_0| < 2\pi$. It is readily verified that the same
expansion of $z + 1$ is valid with

$$\omega = \sqrt{2\bar{w}_0 - 2w},$$

the value for which $\Im(\omega) < 0$ being chosen, provided that $|w - \bar{w}_0| < 2\pi$.

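We have thus found expansions of $z$ valid near the two branch points.
It remains to calculate the leading $c_m$. Differentiating (5.2), we have

$$\left(1 - \frac{1}{1-\zeta}\right)\frac{d\zeta}{d\omega} = -\omega$$

and so

$$\zeta\,\frac{d\zeta}{d\omega} = \omega(1 - \zeta).$$

Equating coefficients of $\omega^m$, we find that

$$\sum_{h=1}^{m} h\, c_h c_{m+1-h} = -c_{m-1} \qquad (m \ge 2).$$

Since $c_1 = 1$, we find from this that $c_2 = -\tfrac{1}{3}$ and that

$$c_m = -\frac{1}{m+1}\left(c_{m-1} + \sum_{h=2}^{m-1} h\, c_h c_{m+1-h}\right) \qquad (m \ge 3).$$

From this we can calculate $c_3$, $c_4$, . . . in succession and we have finally

$$z = -1 + \omega - \frac{\omega^2}{3} + \frac{\omega^3}{36} + \frac{\omega^4}{270} + \frac{\omega^5}{4320} - \frac{\omega^6}{17010} - \frac{139\,\omega^7}{5443200} - \frac{\omega^8}{204120} - \frac{571\,\omega^9}{2351462400} - \cdots, \tag{5.4}$$

provided that $|\omega| < 2\sqrt{\pi}$.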
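The coefficients in (5.4) can be regenerated from the recurrence obtained by equating coefficients of $\omega^m$ in $\zeta\, d\zeta/d\omega = \omega(1 - \zeta)$. The sketch below is illustrative only; it solves that relation for $c_m$ with $c_1 = 1$ using exact fractions, and the function names are hypothetical.

```python
import cmath
import math
from fractions import Fraction


def branch_point_coefficients(m_max):
    """Return [None, c_1, ..., c_{m_max}] from
    sum_{h=1}^{m} h c_h c_{m+1-h} = -c_{m-1}  (m >= 2),  c_1 = 1."""
    c = [None, Fraction(1)]
    for m in range(2, m_max + 1):
        inner = sum(h * c[h] * c[m + 1 - h] for h in range(2, m))
        # The h = 1 and h = m terms together contribute (m + 1) * c_m.
        c.append(-(c[m - 1] + inner) / (m + 1))
    return c


def z_near_branch_point(omega, m_max=9):
    """Partial sum of (5.4): z = -1 + sum_{m} c_m omega^m."""
    c = branch_point_coefficients(m_max)
    return -1 + sum(float(c[m]) * omega ** m for m in range(1, m_max + 1))
```

As a check, for a small $\omega$ with positive imaginary part the value $z$ returned by `z_near_branch_point` should satisfy $z + \log z = -1 + \pi i - \tfrac{1}{2}\omega^2$ with the principal logarithm.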

The Near Roots

6. The method of § 4 suffices to calculate $Z_n$ for all but a finite (usually
quite small) number of values of $n$. For any particular $a$, it is easy to
determine for just which $n$ it is inapplicable (or only applicable with
excessive labour). For these $n$, we have to calculate $z(w)$, where

$$w = \log A + (\alpha + 2n\pi)i.$$

If $w$ lies between the cuts in the $w$-plane, and (say) $u < -2$, we use
the series (5.1). If $w$ lies near one of the branch points $-1 \pm \pi i$, we use
(5.4). Otherwise we find a first approximation $z_0$ to $z(w)$ and improve it
as follows.

We have

$$\frac{dw}{dz} = 1 + \frac{1}{z}, \qquad \frac{dz}{dw} = \frac{z}{1+z} = 1 - \frac{1}{1+z}$$

and so, to a first approximation,

$$\begin{aligned}
\delta x &= \{1 - \Lambda(1 + x)\}\,\delta u - y\Lambda\,\delta v, \\
\delta y &= y\Lambda\,\delta u + \{1 - \Lambda(1 + x)\}\,\delta v,
\end{aligned} \tag{6.1}$$

where $\Lambda^{-1} = (1 + x)^2 + y^2$. If we have an approximation $z_0$ to $Z_n$ we can
calculate

$$\delta u + i\,\delta v = w(Z_n) - w(z_0) = \log A + \alpha i + 2n\pi i - w(z_0)$$

and so use (6.1) to determine a correction $\delta x + i\,\delta y$ to $z_0$. By (1.6) and (3.1),
we have

$$\delta z = z_0(1 + z_0)^{-1}\,\delta w + O\{(\delta w)^2 z_0 (1 + z_0)^{-3}\}.$$

Hence the repeated use of (6.1) leads to a rapidly convergent sequence of
approximations, unless $z_0$ is near $-1$. In the latter case, $w$ is near one of
$-1 \pm \pi i$ and (5.4) applies.

If $|w| > 4$, but $w$ does not lie between the cuts in the $w$-plane, we
can take $z_0 = w - \log w$, where $\log w$ has its principal value. With this
value of $z_0$, the next approximation $z_1$ will be correct to at least one decimal
place and the subsequent approximations $z_2$, . . . converge rapidly.
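The correction step (6.1) is the real and imaginary part of $\delta z = z(1+z)^{-1}\,\delta w$, so the iteration can be written in a few lines of complex arithmetic. The sketch below is illustrative only: it assumes $|w| > 4$ with $w$ not between the cuts, so that the starting value $z_0 = w - \log w$ of the preceding paragraph applies, and the function names and the fixed number of steps are choices made here.

```python
import cmath
import math


def w_of(z):
    """w(z) = z + log z, with the principal value of the logarithm."""
    return z + cmath.log(z)


def near_root(a, n, steps=20):
    """Refine z_0 = w - log w by repeated use of dz = z / (1 + z) * dw."""
    w = complex(math.log(abs(a)), cmath.phase(a) + 2 * n * math.pi)
    z = w - cmath.log(w)                 # first approximation
    for _ in range(steps):
        dw = w - w_of(z)                 # delta u + i * delta v
        z += z / (1 + z) * dw            # complex form of (6.1)
    return z
```

For example, `near_root(1.0, 1)` starts from $z_0 \approx -1.84 + 4.71i$ and settles near $-1.534 + 4.375i$, whose imaginary part lies between $\pi$ and $2\pi$ as § 2 requires for $n = 1$, $\alpha = 0$.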

When $|w| < 4$, but $w$ is not near one of the branch points $-1 \pm \pi i$,
we can use the table of values of $u$, $v$ to find a suitable value for $z_0$. Since
$z_0$ will not be near $-1$, the sequence of approximations converges fairly
rapidly.

Alternatively we can use drawing to obtain our first approximation.
Given $u$, $v$ we have to solve

$$x + \log r = u, \qquad y + \theta = v,$$

where $r^2 = x^2 + y^2$, $\tan\theta = y/x$. To solve these graphically, we require
(i) a sheet of "radial" graph paper (concentric circles and radii) called
the $(x, y)$-plane and (ii) a sheet of tracing paper (the $(X, Y)$-plane) on
which the lines $X = -\log k$ and $Y = -h$ are drawn, where $k$ runs through
the values of the radii of the circles and $h$ through the values of $\theta$
corresponding to the radial lines. We place the origin of the $(X, Y)$-plane at
the point $(u, v)$ on the $(x, y)$-plane, make the axes parallel and then plot
on a second sheet of tracing paper (the second $(x, y)$-plane) over the first
the intersections of $X = -\log k$ with $r = k$ and those of $Y = -h$ with $\theta = h$.




Table of values of u, v, where u + iv = x + iy + log (x + iy).
(For each y the upper line gives u and the lower line gives v.)

   y        x =   -4.0     -3.5     -3.0     -2.5     -2.0
  0.0    u      -2.614   -2.247   -1.901   -1.584   -1.307
         v       3.142    3.142    3.142    3.142    3.142
  0.1    u      -2.613   -2.247   -1.901   -1.583   -1.306
         v       3.217    3.213    3.208    3.202    3.192
  0.2    u      -2.612   -2.246   -1.899   -1.581   -1.302
         v       3.292    3.285    3.275    3.262    3.242
  0.3    u      -2.611   -2.244   -1.896   -1.577   -1.296
         v       3.367    3.356    3.342    3.322    3.293
  0.4    u      -2.609   -2.241   -1.893   -1.571   -1.287
         v       3.442    3.428    3.409    3.383    3.344
  0.5    u      -2.606   -2.237   -1.888   -1.564   -1.277
         v       3.517    3.500    3.476    3.444    3.397
  0.6    u      -2.603   -2.233   -1.882   -1.556   -1.264
         v       3.593    3.572    3.544    3.506    3.450
  0.7    u      -2.599   -2.228   -1.875   -1.546   -1.249
         v       3.668    3.644    3.612    3.569    3.505
  0.8    u      -2.594   -2.222   -1.867   -1.535   -1.233
         v       3.744    3.717    3.681    3.632    3.561
  0.9    u      -2.589   -2.215   -1.858   -1.523   -1.215
         v       3.820    3.790    3.750    3.696    3.619
  1.0    u      -2.583   -2.208   -1.849   -1.509   -1.195
         v       3.897    3.863    3.820    3.761    3.678
  1.5    u      -2.548   -2.163   -1.790   -1.430   -1.084
         v       4.283    4.237    4.178    4.101    3.998
  2.0    u      -2.502   -2.106   -1.718   -1.336   -0.960
         v       4.678    4.622    4.554    4.467    4.356
  2.5    u      -2.449   -2.041   -1.638   -1.237   -0.836
         v       5.083    5.021    4.947    4.856    4.746

   y        x =   -1.5     -1.0     -0.9     -0.8     -0.7
  0.0    u      -1.095   -1.000   -1.005   -1.023   -1.057
         v       3.142    3.142    3.142    3.142    3.142
  0.1    u      -1.092   -0.995   -0.999   -1.015   -1.047
         v       3.175    3.142    3.131    3.117    3.100
  0.2    u      -1.086   -0.980   -0.981   -0.993   -1.017
         v       3.209    3.144    3.123    3.097    3.063
  0.3    u      -1.075   -0.957   -0.953   -0.957   -0.972
         v       3.244    3.150    3.120    3.083    3.037
  0.4    u      -1.060   -0.926   -0.915   -0.912   -0.915
         v       3.281    3.161    3.123    3.078    3.022
  0.5    u      -1.042   -0.888   -0.871   -0.858   -0.851
         v       3.320    3.178    3.134    3.083    3.021
  0.6    u      -1.020   -0.846   -0.821   -0.800   -0.781
         v       3.361    3.201    3.154    3.098    3.033
  0.7    u      -0.996   -0.801   -0.769   -0.739   -0.710
         v       3.405    3.231    3.181    3.123    3.056
  0.8    u      -0.969   -0.753   -0.714   -0.677   -0.639
         v       3.452    3.267    3.215    3.156    3.090
  0.9    u      -0.941   -0.703   -0.659   -0.614   -0.569
         v       3.501    3.309    3.256    3.197    3.132
  1.0    u      -0.911   -0.653   -0.603   -0.553   -0.501
         v       3.554    3.356    3.304    3.246    3.182
  1.5    u      -0.748   -0.411   -0.341   -0.269   -0.196
         v       3.856    3.659    3.611    3.561    3.507
  2.0    u      -0.584   -0.195   -0.115   -0.033    0.051
         v       4.214    4.034    3.994    3.951    3.907
  2.5    u      -0.430   -0.009    0.077    0.165    0.254
         v       4.611    4.451    4.416    4.380    4.344

   y        x =   -0.6     -0.5     -0.4     -0.3     -0.2
  0.0    u      -1.111   -1.193   -1.316   -1.504   -1.809
         v       3.142    3.142    3.142    3.142    3.142
  0.1    u      -1.097   -1.174   -1.286   -1.451   -1.698
         v       3.076    3.044    2.997    2.920    2.778
  0.2    u      -1.058   -1.119   -1.205   -1.320   -1.463
         v       3.020    2.961    2.878    2.754    2.556
  0.3    u      -0.999   -1.039   -1.093   -1.157   -1.220
         v       2.978    2.901    2.798    2.656    2.459
  0.4    u      -0.927   -0.946   -0.970   -0.993   -1.005
         v       2.954    2.867    2.756    2.614    2.434
  0.5    u      -0.847   -0.847   -0.846   -0.839   -0.819
         v       2.947    2.856    2.746    2.611    2.451
  0.6    u      -0.764   -0.747   -0.727   -0.699   -0.658
         v       2.956    2.866    2.759    2.634    2.493
  0.7    u      -0.681   -0.651   -0.615   -0.572   -0.517
         v       2.979    2.891    2.790    2.676    2.549
  0.8    u      -0.600   -0.558   -0.512   -0.457   -0.393
         v       3.014    2.929    2.834    2.730    2.616
  0.9    u      -0.521   -0.471   -0.415   -0.353   -0.281
         v       3.059    2.978    2.889    2.793    2.689
  1.0    u      -0.446   -0.388   -0.326   -0.257   -0.180
         v       3.111    3.034    2.951    2.862    2.768
  1.5    u      -0.120   -0.042    0.040    0.125    0.214
         v       3.451    3.393    3.331    3.268    3.203
  2.0    u       0.136    0.223    0.313    0.404    0.498
         v       3.862    3.816    3.768    3.720    3.670
  2.5    u       0.344    0.436    0.529    0.623    0.719
         v       4.306    4.268    4.229    4.190    4.151

   y        x =   -0.1      0.0     +0.1     +0.2     +0.3     +0.4     +0.5     +0.6
  0.0    u      -2.403      -     -2.203   -1.409   -0.904   -0.516   -0.193    0.089
         v       3.142      -      0        0        0        0        0        0
  0.1    u      -2.056   -2.303   -1.856   -1.298   -0.851   -0.486   -0.174    0.103
         v       2.456    1.671    0.885    0.564    0.422    0.345    0.297    0.265
  0.2    u      -1.598   -1.609   -1.398   -1.063   -0.720   -0.405   -0.119    0.142
         v       2.234    1.771    1.307    0.985    0.788    0.664    0.581    0.522
  0.3    u      -1.251   -1.204   -1.051   -0.820   -0.557   -0.293   -0.039    0.201
         v       2.193    1.871    1.549    1.283    1.085    0.944    0.840    0.764
  0.4    u      -0.986   -0.916   -0.786   -0.605   -0.393   -0.170    0.054    0.273
         v       2.216    1.971    1.726    1.507    1.327    1.185    1.075    0.988
  0.5    u      -0.774   -0.693   -0.574   -0.419   -0.239   -0.046    0.153    0.353
         v       2.268    2.071    1.873    1.690    1.530    1.396    1.285    1.195
  0.6    u      -0.597   -0.511   -0.397   -0.258   -0.099    0.073    0.253    0.436
         v       2.336    2.171    2.006    1.849    1.707    1.583    1.476    1.385
  0.7    u      -0.447   -0.357   -0.247   -0.117    0.028    0.185    0.349    0.519
         v       2.413    2.271    2.129    1.992    1.866    1.752    1.651    1.562
  0.8    u      -0.315   -0.223   -0.115    0.007    0.143    0.288    0.442    0.600
         v       2.495    2.371    2.246    2.126    2.012    1.907    1.812    1.727
  0.9    u      -0.199   -0.105    0.001    0.119    0.247    0.385    0.529    0.679
         v       2.581    2.471    2.360    2.252    2.149    2.053    1.964    1.883
  1.0    u      -0.095    0.000    0.105    0.220    0.343    0.474    0.612    0.754
         v       2.670    2.571    2.471    2.373    2.279    2.190    2.107    2.030
  1.5    u       0.308    0.405    0.508    0.614    0.725    0.840    0.958    1.080
         v       3.137    3.071    3.004    2.938    2.873    2.810    2.749    2.690
  2.0    u       0.594    0.693    0.794    0.898    1.004    1.113    1.223    1.336
         v       3.621    3.571    3.521    3.471    3.422    3.373    3.326    3.279
  2.5    u       0.817    0.916    1.017    1.119    1.223    1.329    1.436    1.544
         v       4.111    4.071    4.031    3.991    3.951    3.912    3.873    3.835

   y        x =   +0.7     +0.8     +0.9     +1.0     +1.5     +2.0     +2.5     +3.0
  0.0    u       0.343    0.577    0.795    1.000    1.905    2.693    3.416    4.099
         v       0        0        0        0        0        0        0        0
  0.1    u       0.353    0.585    0.801    1.005    1.908    2.694    3.417    4.099
         v       0.242    0.224    0.211    0.200    0.167    0.150    0.140    0.133
  0.2    u       0.383    0.607    0.819    1.020    1.914    2.698    3.419    4.101
         v       0.478    0.445    0.419    0.397    0.333    0.300    0.280    0.267
  0.3    u       0.428    0.643    0.847    1.043    1.925    2.704    3.423    4.104
         v       0.705    0.659    0.622    0.591    0.497    0.449    0.419    0.400
  0.4    u       0.485    0.688    0.885    1.074    1.940    2.713    3.429    4.107
         v       0.919    0.864    0.818    0.781    0.661    0.597    0.559    0.533
  0.5    u       0.549    0.742    0.929    1.112    1.958    2.723    3.436    4.112
         v       1.120    1.059    1.007    0.964    0.822    0.745    0.697    0.665
  0.6    u       0.619    0.800    0.979    1.154    1.980    2.736    3.444    4.118
         v       1.309    1.244    1.188    1.140    0.981    0.891    0.836    0.797
  0.7    u       0.690    0.861    1.031    1.199    2.004    2.751    3.454    4.125
         v       1.485    1.419    1.361    1.311    1.137    1.037    0.973    0.929
  0.8    u       0.761    0.923    1.086    1.247    2.031    2.767    3.465    4.133
         v       1.652    1.585    1.527    1.475    1.290    1.181    1.110    1.061
  0.9    u       0.831    0.986    1.141    1.297    2.059    2.785    3.477    4.142
         v       1.810    1.744    1.685    1.633    1.440    1.323    1.246    1.191
  1.0    u       0.899    1.047    1.197    1.347    2.089    2.805    3.491    4.151
         v       1.960    1.896    1.838    1.785    1.588    1.464    1.381    1.322
  1.5    u       1.204    1.331    1.459    1.589    2.252    2.916    3.570    4.210
         v       2.634    2.581    2.530    2.483    2.285    2.144    2.040    1.964
  2.0    u       1.451    1.567    1.685    1.805    2.416    3.040    3.664    4.282
         v       3.234    3.190    3.148    3.107    2.927    2.785    2.675    2.588
  2.5    u       1.654    1.765    1.877    1.991    2.570    3.164    3.763    4.362
         v       3.798    3.761    3.725    3.690    3.530    3.396    3.285    3.195






These two sets of points lie on two curves whose intersection is the required
solution in the $(x, y)$-plane. The first piece of tracing paper can be
replaced by suitable markings on the edges of the drawing board and the 
use of a tee-square. 


References to Literature

Bellman, R., 1949. Ann. Math., Princeton, 50, 347-355.

Eisenstein, G., 1844. J. Reine Angew. Math., 28, 49-52.

Euler, L., 1927. Opera Omnia (1), 15 (Leipzig and Berne).

Hayes, N. D., 1950. J. Lond. Math. Soc., 25, 226-232.

Hayes, N. D., 1952. Quart. J. Math., 3, 81-90.

Hurwitz, A., and Courant, R., 1929. Funktionentheorie (3rd ed., Berlin),
141-142.

Lémeray, E. M., 1896. Nouv. Ann. Math. (3), 15, 548-556 and 16, 54-61.

Lémeray, E. M., 1897. Proc. Edin. Math. Soc., 16, 13-35.

Polossuchin, O., 1910. Thesis (Zurich) (not seen).

Schürer, F., 1912. Ber. Sächs. Ges. (Akad.) Wiss., Math.-phys. Kl., 64, 167-236.

Schürer, F., 1913. Ibid., 65, 239-246.

Wright, E. M., 1947. Quart. J. Math., 18, 228-235.

Wright, E. M., 1955. J. Reine Angew. Math., 194, 66-87, esp. 72-74.

Wright, E. M., 1959. Bull. Amer. Math. Soc., 65, 89-93.


(Issued separately October 26, 1959)




’INSTRUCTIONS TO AUTHORS (continued) 

The overall space available for figures and plates is g|* x 7J' in Transactions and 
7 $* x 4 i * va Proceedings. 

Photographs should be unmounted glossy black and white prints, marked on the back 
with the name of the author, number of the figure, and an indication of the top. Prints should 
not be trimmed or cut out. Authors should suggest the arrangement of the figures on each 
plate, either by a diagram or a paste-up of rough prints, with indication of any lettering to be 
inserted. 

Lettering must be in black ink in neat mid legible style, or lightly written in pencil. 

Indicating Lines and Arrows required on wash drawings or photographs should have 
the point marked by a pin-hole, and must be drawn on the back or on a covering tissue and 
not on the photograph or drawing. On line drawings such lines should be drawn thin but 
firmly. 

Graphs should be drawn on white Bristol board or on squared paper ruled in faint blue 
lines: no other colour ruling is suitable. Since blue lines do not appear in reproductions, any 
cross-lines required must be drawn. If a 1 racing of a graph constructed on an unsuitable grid 
is submitted, white tracing paper should be used. 

Numerical values. In papers concerned with the biological sciences ALL numerical 
values must be accompanied by their decimal equivalents. 

Tables. Tabular matter must be kept to a minimum. No paper should present both 
tables and corresponding graphs, except where good reason can be given. 

Duplicates. When the author can conveniently do so, it is desirable that duplicates of 
illustrations should be supplied for the use of the referees. 

Proofs of papers will be sent to authors or communicators to the addresses indicated in 
correspondence or on MSS. The cost of authors' corrections in excess of 5 per cent. of the
printer’s charges for the setting of a particular paper will be charged to the authors. Proofs
should, if possible, be returned within one week to The Secretary, Royal Society of Edinburgh,
23 George Street, Edinburgh, 2, and not to the printer. To prevent delay, authors who are
abroad should, if possible, appoint someone in this country to correct proofs.

Offprints. As soon as a Transactions paper, or the sheet in which the last part of a 
Proceedings paper appears, is ready for press, copies of offprints, in covers bearing the title of 
the paper and the name of the author, are printed and placed on sale. The date of such 
separate publication is printed on each paper. Authors of papers will receive twenty-five
offprints free, if they so desire. Additional offprints may be obtained at a fixed scale of prices given
on a form attached to the first proof. To prevent disappointment, especially if a paper
contains plates, authors should, as soon as possible, notify the Secretary of the number of
additional copies required. 

Indexes. To facilitate the compilation of indexes, and to secure due attention being
given to the important points of a paper in Catalogues of Scientific Literature, authors are
requested to supply to the Secretary, with the final proofs of papers, a brief index (on the
model given below) of the points in a paper which are considered new or important. Indexes
will be edited by the Secretary for publication in each volume of the Transactions and
Proceedings.




MODEL INDEX 

Houstoun, R. A.—A measurement of the velocity of light. Proc. Roy. Soc. Edin., A, 63, 95-104.

Light, A measurement of the velocity of.

R. A. Houstoun. Proc. Roy. Soc. Edin., A, 63, 1949-50, 95-104.

Velocity of light, A measurement of the.

R. A. Houstoun. Proc. Roy. Soc. Edin., A, 63, 1949-50, 95-104.

Authors are advised, before sending

papers ... to the ...

dealing with symbols, signs, and abbreviations ...















